QUADRATIC FIELD
Abstract. We prove that the statistics of the period of the continued fraction expansion
of certain sequences of quadratic irrationals from a fixed quadratic field approach the
'normal' statistics given by the Gauss-Kuzmin measure. As far as we know, these are
the first non-average results about the statistics of the periods of quadratic irrationals.
As a by-product, the growth rate of the period is analyzed and, for example, it is shown
that for a fixed integer $k$ and a quadratic irrational $\alpha$, the length of the period of the
continued fraction expansion of $k^n\alpha$ equals $c'k^n + o(k^{(1-\frac{1}{16})n})$ for some positive constant
$c'$. This improves results of Lagarias and Grisel. The results are derived from the
main theorem of the paper, which establishes an equidistribution result regarding single
periodic geodesics along certain paths in the Hecke graph. The results are effective and
give rates of convergence and the main tools are spectral gap (effective decay of matrix
coefficients) and dynamical analysis on $S$-arithmetic homogeneous spaces.



1. Introduction 

1.1. Continued fractions. The elementary theory of continued fractions starts by assigning
to each real number $x \in [0,1] \setminus \mathbb{Q}$ an infinite sequence of positive integers¹ referred
to as the continued fraction expansion of $x$ (abbreviated c.f.e); namely to each number $x$
corresponds a sequence $\{a_n(x)\}_{n\in\mathbb{N}}$ which is characterized by the requirement

\[ x = \lim_{n\to\infty} \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cfrac{1}{\ddots + \cfrac{1}{a_n}}}} \tag{1.1} \]

We refer to the numbers $a_n(x)$ as the digits of the expansion as in (1.1).² When $x$ is
understood we usually write $a_i$ for the $i$'th digit of the c.f.e of $x$.

Given a number $x$, it is natural to ask for information regarding the statistical properties
of its c.f.e; that is, for any finite sequence of natural numbers $w = (w_1, \dots, w_k)$ (referred
to hereafter as a pattern) one is interested in the frequency of appearance of the pattern
$w$ in the c.f.e of $x$, or in other words in the value of the limit³

\[ D(x,w) = \lim_{N\to\infty} \frac{1}{N} \#\{1 \le n \le N : w = (a_{n+1}, \dots, a_{n+k})\}. \tag{1.2} \]



1 We shall completely ignore the rational numbers, which correspond to finite sequences, as well as real
numbers outside the unit interval, for which an additional integer digit $a_0$ is needed.

2 This correspondence is in fact a homeomorphism when $\mathbb{N}^{\mathbb{N}}$ is considered with the product topology.
3 The limit does not always exist.

It turns out (as will be explained shortly) that this frequency exists and equals some
explicit integral (depending only on $w$) for Lebesgue almost any $x$.

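To see this, note that the c.f.e correspondence $x \leftrightarrow \{a_n(x)\}$ fits in the commutative
diagram

\[ \begin{array}{ccc}
\mathbb{N}^{\mathbb{N}} & \xrightarrow{\ \sigma\ } & \mathbb{N}^{\mathbb{N}} \\
\updownarrow & & \updownarrow \\
[0,1]\setminus\mathbb{Q} & \xrightarrow{\ S\ } & [0,1]\setminus\mathbb{Q},
\end{array} \tag{1.3} \]

where $S(x) = \{\frac{1}{x}\} = \frac{1}{x} - \lfloor\frac{1}{x}\rfloor$ is the so-called Gauss map and $\sigma$ is the shift map
$\sigma(a_1, a_2, \dots) = (a_2, a_3, \dots)$. Gauss observed in 1845 that $S$ preserves the measure given
by

\[ \nu(A) = \frac{1}{\log 2} \int_A \frac{1}{1+x}\,dx, \tag{1.4} \]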

on the unit interval (this is the so-called Gauss-Kuzmin measure). The map $S$ is ergodic
with respect to $\nu$, which implies by the pointwise ergodic theorem (see for example [EW11,
§2.6, §9.6]) that for $\nu$ (or equivalently Lebesgue) almost any $x$ and any pattern $w =
(w_1, \dots, w_k)$, the frequency $D(x,w)$ defined in (1.2) exists. More precisely, if we let

\[ I_w = \{x \in [0,1] \setminus \mathbb{Q} : w = (a_1(x), \dots, a_k(x))\} \tag{1.5} \]

denote the interval consisting of those points for which the c.f.e starts with the pattern $w$,
then the pointwise ergodic theorem tells us that the ergodic averages of the characteristic
function of $I_w$ converge almost surely to $\nu(I_w)$; that is

\[ \lim_{N\to\infty} \frac{1}{N} \sum_{i=0}^{N-1} \chi_{I_w}(S^i(x)) = \nu(I_w), \tag{1.6} \]

for $\nu$ almost any $x$. As the set of possible patterns is countable we conclude that for
Lebesgue almost any $x$, (1.6) holds for any pattern $w$. It is straightforward to check using
the commutation in (1.3) that the limit in (1.6) is equal to the limit in (1.2).
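
As a numerical illustration of the almost sure frequency, the following Python sketch computes the Gauss-Kuzmin measure of the interval of points whose expansion starts with a pattern w, using the two endpoints [0; w_1, ..., w_k] and [0; w_1, ..., w_k + 1] of that interval, and compares it with the empirical frequency of w among the digits of a rational number with a large random denominator. The rational number is only a finite stand-in for a Lebesgue-typical point (an assumption of this sketch, not part of the argument above), and all function names are hypothetical.

```python
from fractions import Fraction
from math import log2
import random


def cf_value(digits):
    """Value of the finite continued fraction [0; d_1, ..., d_k]."""
    x = Fraction(0)
    for d in reversed(digits):
        x = 1 / (d + x)
    return x


def gauss_measure(w):
    """Gauss-Kuzmin measure of the interval of points whose c.f.e starts with w."""
    w = list(w)
    a, b = cf_value(w), cf_value(w[:-1] + [w[-1] + 1])
    lo, hi = min(a, b), max(a, b)
    # (1/log 2) * integral of dx/(1+x) over (lo, hi) = log2((1+hi)/(1+lo))
    return log2(float((1 + hi) / (1 + lo)))


def digits_of_rational(x):
    """Digits of a rational x in (0, 1), obtained by iterating the Gauss map."""
    digits = []
    while x != 0:
        y = 1 / x
        a = y.numerator // y.denominator  # floor(1/x)
        digits.append(a)
        x = y - a                         # S(x) = {1/x}
    return digits


def frequency(digits, w):
    """Proportion of windows of the digit sequence that equal the pattern w."""
    w, k = list(w), len(w)
    windows = len(digits) - k + 1
    return sum(digits[n:n + k] == w for n in range(windows)) / windows


# Finite stand-in for a typical point: a rational with a 4000-bit denominator.
random.seed(0)
den = 1 << 4000
x = Fraction(random.randrange(1, den), den)
digits = digits_of_rational(x)
for w in [(1,), (2,), (1, 1)]:
    print(w, round(frequency(digits, w), 4), round(gauss_measure(w), 4))
```

For a single digit a the formula reduces to the classical value log2(1 + 1/(a(a+2))), which gives a quick sanity check of `gauss_measure`.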

1.2. Quadratic irrationals. By Lagrange's Theorem (see for example [EW11, §3.3]) the
numbers $x$ for which the c.f.e is eventually periodic are exactly the quadratic irrationals;
that is, real numbers which are roots of irreducible quadratic polynomials over the rationals.
For quadratic irrationals (which clearly form a Lebesgue-null set) it is clear that the
limit in (1.2) always exists and is different from the almost sure value of the frequency.
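
The eventual periodicity can be observed with exact integer arithmetic. The following is a minimal sketch, assuming the quadratic irrational is given in the form (P + sqrt(D))/Q with integers P, Q (Q nonzero) and D a positive non-square integer; this parametrization and the function name are choices made here for illustration. It follows the classical recurrence on the complete quotients and stops as soon as a pair (P, Q) repeats.

```python
from math import isqrt


def quadratic_cfe(P, Q, D):
    """Continued fraction of (P + sqrt(D)) / Q.

    Returns (preperiod, period) as lists of digits. The first digit produced is
    the integer part a_0; it belongs to the period when the expansion is purely
    periodic.
    """
    if Q == 0 or D <= 0 or isqrt(D) ** 2 == D:
        raise ValueError("need Q != 0 and D a positive non-square integer")
    if (D - P * P) % Q != 0:
        # Rescale so that Q divides D - P^2; the number itself is unchanged.
        P, Q, D = P * abs(Q), Q * abs(Q), D * Q * Q
    root = isqrt(D)
    seen, digits = {}, []
    while (P, Q) not in seen:
        seen[(P, Q)] = len(digits)
        if Q > 0:
            a = (P + root) // Q            # floor((P + sqrt(D)) / Q)
        else:
            a = (-P - root - 1) // (-Q)    # same floor, written with a positive denominator
        digits.append(a)
        P = a * Q - P
        Q = (D - P * P) // Q
    start = seen[(P, Q)]
    return digits[:start], digits[start:]


print(quadratic_cfe(0, 1, 7))   # sqrt(7)
print(quadratic_cfe(1, 2, 5))   # golden ratio
```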

1.3. General goal. In this paper we investigate the behavior of $D(x,w)$ where $x$ varies in
some fixed quadratic field. We make the convention to consider $x$ mod 1 instead of $x$. This
influences only the 0'th digit in the classical discussion on continued fractions and does
not, of course, affect any statistical property of the c.f.e. Our approach manages to deal
with sequences $x_n$ which are related arithmetically in a way that involves only finitely
many primes (see for example Theorems 2.1, 2.7 and Remark 2.6). Nevertheless, the
discussion leaves many natural open questions, a few of which we state below (see §2.7).
The following theorem demonstrates the flavor of our results, which will be stated below
in increasing levels of generality, culminating in Theorem 4.9.

Notation 1.1. Throughout this paper we use the notation $\ll$ in the following manner:
Given two quantities $A, B$ depending on some set of parameters $P$, we denote $A \ll B$ if
there exists some absolute constant $c > 0$ (independent of any varying parameter) such
that $A \le cB$. Given a subset $P'$ of the parameter set $P$, we denote $A \ll_{P'} B$ if there exists
a constant $c_{P'} > 0$, depending possibly on the parameters in $P'$, such that $A \le c_{P'}B$.

We denote by $|I_w|$ the length of the interval $I_w$ defined in (1.5). The following theorem
follows from Corollary 2.2 as explained in Remark 2.3.

Theorem 1.2. Let $\alpha$ be a quadratic irrational, $w = (w_1, \dots, w_k)$ a finite pattern of
natural numbers, and $p$ a prime number. We have that $D(p^n\alpha, w) \to \nu(I_w)$ as $|n| \to \infty$.
Moreover, the following effective estimate holds

\[ |D(p^n\alpha, w) - \nu(I_w)| \ll_{\alpha,p} |I_w|^{-1} p^{-\frac{|n|}{32}}. \]

1.4. The measures $\nu_\alpha$. Before ending this short introduction we adopt a slightly different
viewpoint which will be more convenient to our discussion. As the c.f.e of a quadratic
irrational is eventually periodic, it follows from (1.3) that the orbit $\{S^n\alpha\}_{n\in\mathbb{N}}$ of $\alpha$ under
the Gauss map is eventually periodic with some period $P_\alpha = \{x_1, \dots, x_\ell\} \subset [0,1] \setminus \mathbb{Q}$
with $S(x_i) = x_{i+1}$ for all $i < \ell$ and $S(x_\ell) = x_1$. Let us denote by $\nu_\alpha$ the normalized
counting measure on the set $P_\alpha$. Note that with this notation $D(\alpha,w) = \nu_\alpha(I_w)$, and so
Theorem 1.2 above could be restated as saying that $\nu_{p^n\alpha}$ converge in the weak* topology
to the Gauss-Kuzmin measure $\nu$ (with an explicit estimate on the rate of convergence).
All convergence statements regarding measures will refer to the weak* topology. We sometimes
say that a sequence of measures $\mu_n$ equidistributes to a measure $\mu$ to mean that it
converges to it.
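
To get a feeling for the measures attached to p^n times a quadratic irrational, one can compute the period exactly and count cyclic occurrences of a pattern in it. The sketch below does this for the illustrative choice of sqrt(3) and p = 2 (our choice, as are the function names), using that p^n sqrt(d) = sqrt(d p^{2n}), and prints the frequency of the pattern (1) next to the Gauss-Kuzmin value log2(4/3). It is an experiment under these assumptions, not a substitute for the estimates in the text.

```python
from math import isqrt, log2


def sqrt_period(N):
    """Period of the c.f.e of sqrt(N) for a positive non-square integer N."""
    root = isqrt(N)
    if N <= 0 or root * root == N:
        raise ValueError("N must be a positive non-square integer")
    P, Q = 0, 1
    seen, digits = {}, []
    while (P, Q) not in seen:            # complete quotient (P + sqrt(N)) / Q
        seen[(P, Q)] = len(digits)
        a = (P + root) // Q
        digits.append(a)
        P = a * Q - P
        Q = (N - P * P) // Q
    return digits[seen[(P, Q)]:]


def pattern_frequency(period, w):
    """Fraction of starting positions in the cyclic period at which w occurs."""
    L, k = len(period), len(w)
    hits = sum(all(period[(i + j) % L] == w[j] for j in range(k)) for i in range(L))
    return hits / L


d, p, w = 3, 2, (1,)
target = log2(4 / 3)   # Gauss-Kuzmin measure of (1/2, 1), the points with first digit 1
for n in range(0, 11, 2):
    period = sqrt_period(d * p ** (2 * n))
    print(n, len(period), round(pattern_frequency(period, w), 4), round(target, 4))
```

Counting cyclically matches the normalized counting measure on the periodic orbit: each point of the orbit has an expansion given by a rotation of the period.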

2. Results 

The results appearing in this paper divide into two. One portion deals with effective
equidistribution of single periodic geodesics arising from a fixed quadratic extension of
$\mathbb{Q}$ and the other portion deals with harvesting the conclusions of the first part in the
theory of continued fractions. Though in order to describe the results regarding continued
fractions we do not need too much preparation, the statement of Theorem 4.9, which is
the main result of this paper and belongs to the first portion, needs some preparations.
We therefore describe in this section the precise results regarding continued fractions,
but make the compromise to describe the results regarding periodic geodesics in a more
'hands on' way and refer the reader to §4 for the more accurate state of affairs.

2.1. Height and support. Given a rational number $q = \frac{k_1}{k_2}$, where $k_1, k_2$ are co-prime
integers, we define the height of $q$ to be

\[ \operatorname{ht}(q) = k_1 k_2. \tag{2.1} \]

Given a set of primes $S$, we say that $q$ is supported on $S$ if all the primes dividing $k_1, k_2$
are in $S$.
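
For concreteness, here is a small Python sketch of the two notions just defined. It assumes a nonzero rational given as a `Fraction` and takes absolute values so that the height is positive also for negative inputs (an assumption of the sketch); the function names are hypothetical.

```python
from fractions import Fraction


def height(q):
    """Height of a nonzero rational: product of numerator and denominator in lowest terms."""
    q = Fraction(q)
    return abs(q.numerator) * q.denominator


def prime_factors(n):
    """Set of primes dividing the nonzero integer n (trial division)."""
    n, p, factors = abs(n), 2, set()
    while p * p <= n:
        while n % p == 0:
            factors.add(p)
            n //= p
        p += 1
    if n > 1:
        factors.add(n)
    return factors


def is_supported_on(q, S):
    """True if every prime dividing the numerator or denominator of q lies in S."""
    q = Fraction(q)
    return (prime_factors(q.numerator) | prime_factors(q.denominator)) <= set(S)


print(height(Fraction(4, 3)), is_supported_on(Fraction(4, 3), {2, 3}))
```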

2.2. The exponent $\delta_0$. Appearing in the results below is a parameter $\frac{25}{64} \le \delta_0 \le \frac{1}{2}$ whose
exact value is not known (although according to the Ramanujan conjecture $\delta_0 = \frac{1}{2}$). The
bigger it is the stronger the statements are, and the best known lower bound for it to this
date is $\delta_0 \ge \frac{25}{64}$; a bound given by Kim and Sarnak in the appendix of [Kim03]. The
meaning of this parameter will be explained in §6.3, but for the moment we will simply
mention that it has to do with certain complementary series representations of $\mathrm{GL}_2$ not
appearing in the spectral decomposition of some unitary representation.

2.3. Results regarding continued fractions. To the best of our knowledge, there are
basically no results in the literature regarding the statistical evolution of the period of
the c.f.e of quadratics (not on average), which is the subject of the results in this section.
We note though that there is plenty of literature regarding the evolution of the length of
the period (see §2.6 and Corollary 2.5). The following results are proved in §8 and all of
them are consequences of Theorem 4.9. Recall that a function $f : Y \to Z$ between metric
spaces $(Y, d_Y)$, $(Z, d_Z)$ is said to be $\kappa$-Lipschitz, for $\kappa > 0$, if for any $y_1, y_2 \in Y$ we have
$d_Z(f(y_1), f(y_2)) \le \kappa\, d_Y(y_1, y_2)$. We then say that $\kappa$ is a Lipschitz constant for $f$.

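Theorem 2.1. Let $\alpha$ be a quadratic irrational, $S$ a finite set of primes, $q$ a rational
number supported on $S$, and $\epsilon > 0$. For any $\kappa$-Lipschitz function $f : [0,1] \to \mathbb{C}$ the
following holds

\[ \left| \int_0^1 f\,d\nu - \int_0^1 f\,d\nu_{q\alpha} \right| \ll_{\alpha,S,\epsilon} \max\{\|f\|_\infty, \kappa\} \operatorname{ht}(q)^{-\frac{\delta_0}{6}+\epsilon}, \tag{2.2} \]

where $\|f\|_\infty = \sup\{|f(t)| : t \in [0,1]\}$.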

When we use Theorem 2.1 to try and estimate the frequency of a pattern in the period
of the c.f.e of $q\alpha$ we obtain the following corollary, which we leave without proof.

Corollary 2.2. Let $\alpha, S, q, \epsilon$ be as in Theorem 2.1. For any finite pattern $w = (w_1, \dots, w_k)$
of digits,

\[ |D(q\alpha,w) - \nu(I_w)| \ll_{\alpha,S,\epsilon} |I_w|^{-1} \operatorname{ht}(q)^{-\frac{\delta_0}{12}+\epsilon}. \tag{2.3} \]

The exponent in (2.2) is cut in half in (2.3) as a result of the fact that $\chi_{I_w}$ is not
Lipschitz and one needs to use an approximation of it in order to apply Theorem 2.1.

Remark 2.3. Theorem 1.2 is obtained from the above corollary by taking $q = p^n$, the
Kim-Sarnak exponent $\delta_0 = \frac{25}{64}$, and choosing $\epsilon = \frac{1}{768}$ so that $-\frac{\delta_0}{12} + \epsilon = -\frac{1}{32}$.

In particular, it follows from Theorem 2.1 that if $q_n$ is a sequence of rationals supported
on $S$ with $\operatorname{ht}(q_n) \to \infty$, then $\nu_{q_n\alpha}$ equidistributes to the Gauss-Kuzmin measure $\nu$.
The following example, which was essentially communicated to us by A. Ubis, shows that
one cannot expect such convergence to hold for a general sequence of rationals $q_n$ with
$\operatorname{ht}(q_n) \to \infty$.

Example 2.4. Let $D$ be a fundamental discriminant such that the negative Pell equation
$x^2 - Dy^2 = -1$ has an integer solution (see [Lag80], [FK10] for example). Whenever the
equation is soluble, it corresponds to a fundamental unit $\epsilon_D = k_1 + n_1\sqrt{D}$ in $\mathbb{Q}(\sqrt{D})$ of
norm $-1$ and in turn, the odd powers $\epsilon_D^j$ correspond to infinitely many further solutions
$(k_j, n_j)$ of the negative Pell equation. Fix such $D$ and for odd $j$ let $\alpha_j$ solve the equation
$x = 2k_j + \frac{1}{x}$. That is, the c.f.e of $\alpha_j$ is purely periodic with period of length 1 of digit $2k_j$
(note that here we abuse the notation introduced above and we do record the 0'th digit).
Solving for $x$ in the above equation we see that $\alpha_j$ could be chosen to be $k_j + \sqrt{k_j^2 + 1}$.
As $(k_j, n_j)$ solve the negative Pell equation for $D$ we get $\alpha_j = k_j + n_j\sqrt{D}$, which shows
that the measures $\nu_{n_j\sqrt{D}}$ are not converging to the Gauss-Kuzmin measure and in fact are
atomic measures supported on single points.
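
The arithmetic behind this example can be checked exactly. The sketch below assumes an integer solution (k_1, n_1) of the negative Pell equation is supplied by hand (for instance (1, 1) for D = 2, or (2, 1) for D = 5; these inputs and the function names are our illustrative choices), generates the solutions coming from odd powers of the unit, and verifies in integer arithmetic that k + n sqrt(D) satisfies x = 2k + 1/x and has integer part 2k.

```python
from math import isqrt


def odd_power_solutions(D, k1, n1, count):
    """Pairs (k_j, n_j), j = 1, 3, 5, ..., with k_j + n_j*sqrt(D) = (k1 + n1*sqrt(D))^j."""
    if k1 * k1 - D * n1 * n1 != -1:
        raise ValueError("(k1, n1) must solve x^2 - D*y^2 = -1")
    # Square of the unit: (k1 + n1*sqrt(D))^2 = a + b*sqrt(D)
    a, b = k1 * k1 + D * n1 * n1, 2 * k1 * n1
    k, n, out = k1, n1, []
    for _ in range(count):
        out.append((k, n))
        k, n = k * a + D * n * b, k * b + n * a   # multiply by the square of the unit
    return out


def check_alpha(D, k, n):
    """Check that alpha = k + n*sqrt(D) solves x = 2k + 1/x and that floor(alpha) = 2k."""
    # alpha^2 - 2k*alpha - 1 = (D*n^2 - k^2 - 1) + 0*sqrt(D)
    fixed_point = D * n * n - k * k - 1 == 0
    # floor(k + n*sqrt(D)) = k + isqrt(D*n^2)
    first_digit = k + isqrt(D * n * n) == 2 * k
    return fixed_point and first_digit


for D, k1, n1 in [(2, 1, 1), (5, 2, 1)]:
    sols = odd_power_solutions(D, k1, n1, 4)
    print(D, sols, all(check_alpha(D, k, n) for k, n in sols))
```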

The following corollary discusses the growth rate of the period. It follows from the
argument yielding Theorem 2.1 and is proved at the end of §8. For a quadratic irrational
$\alpha$ we denote by $|P_\alpha|$ the cardinality of the support of $\nu_\alpha$. Note that $|P_\alpha|$ is the length
of the period of the c.f.e of $\alpha$. Apparently, it was known already to Dirichlet that the
length of the period of the c.f.e of $5^n\sqrt{5}$ grows very quickly. In the Appendix of [Lag80],
using the methods of Dirichlet, Lagarias shows that under some restrictive assumptions
on $\alpha$ one has that for any integer $k$ there exists a constant $C$ for which $C\frac{k^n}{n} \le |P_{k^n\alpha}|$.
Under some restrictive assumptions on $\alpha$, Grisel [Gri98] proved a stronger estimate of
the form $C_1k^n \le |P_{k^n\alpha}| \le C_2k^n$. The following corollary strengthens these results in
several respects. Without any restrictive assumption on $\alpha$ we establish that for a fixed $k$,
$|P_{k^n\alpha}| = c'k^n + o(k^{(1-\delta)n})$, where the constant $c' > 0$ depends on $\alpha$ and $k$, and $\delta$ could be
taken to be $\frac{1}{16}$ similarly to Remark 2.3.

Corollary 2.5. Let $\alpha$, $S$, $q$, and $\epsilon$ be as in Theorem 2.1. There exists an absolute positive
constant $c$ and a positive function $c_\alpha(q)$ which attains only finitely many values (as $q$
ranges over the rationals supported on $S$) such that

\[ \left| c\,c_\alpha(q) - \frac{|P_{q\alpha}|}{\operatorname{ht}(q)} \right| \ll_{\alpha,S,\epsilon} \operatorname{ht}(q)^{-\frac{\delta_0}{6}+\epsilon}. \tag{2.4} \]

Moreover, if $q_n$ satisfies $\operatorname{ht}(q_n) \mid \operatorname{ht}(q_{n+1})$ (for example $q_n = k^n$), then $c_\alpha(q_n)$ is constant
for large values of $n$.
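
The growth of the period length can be observed experimentally. The following sketch (an illustration under our own choice of example, with hypothetical function names) computes the period length of k^n sqrt(d) = sqrt(d k^{2n}) exactly and lists it together with its ratio to k^n; the corollary predicts that this ratio stabilizes as n grows.

```python
from math import isqrt


def sqrt_period_length(N):
    """Length of the period of the c.f.e of sqrt(N), N a positive non-square integer."""
    root = isqrt(N)
    if N <= 0 or root * root == N:
        raise ValueError("N must be a positive non-square integer")
    P, Q = 0, 1
    seen, steps = {}, 0
    while (P, Q) not in seen:            # complete quotient (P + sqrt(N)) / Q
        seen[(P, Q)] = steps
        a = (P + root) // Q
        P = a * Q - P
        Q = (N - P * P) // Q
        steps += 1
    return steps - seen[(P, Q)]


def period_growth(d, k, n_max):
    """List of (period length of k^n*sqrt(d), period length / k^n) for n = 0..n_max."""
    rows = []
    for n in range(n_max + 1):
        length = sqrt_period_length(d * k ** (2 * n))
        rows.append((length, length / k ** n))
    return rows


for n, (length, ratio) in enumerate(period_growth(2, 3, 8)):
    print(n, length, round(ratio, 4))
```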

Remark 2.6. Consider for simplicity the case where $S = \{p\}$ consists of a single prime.
In the above theorems we considered the two sequences $p^{\pm n}\alpha$, where $\alpha$ is a quadratic
irrational. In fact, as explained in Remark 4.14, corresponding results hold for other
sequences. As an example, given a sequence $\{j_i\}_{i=0}^{\infty}$, where $j_i \in \{0, \dots, p-1\}$, if we
define recursively $\alpha_0 = \alpha$ and $\alpha_{n+1} = \frac{\alpha_n + j_n}{p}$ (i.e. $\alpha_n = p^{-n}(\alpha + \sum_{i=0}^{n-1} j_i p^i)$), then the
following holds: If $p$ does not split in $\mathbb{Q}(\alpha)$, then the estimates (2.2), (2.3), and (2.4)
with $q\alpha$ replaced with $\alpha_n$ and $\operatorname{ht}(q)$ replaced by $p^n$ still hold. If $p$ splits in $\mathbb{Q}(\alpha)$ a more
restrictive statement holds: If the sequence $\{j_i\}$ used to define $\alpha_n$ is eventually periodic
then the estimates (2.2), (2.3), and (2.4) with $q\alpha$ replaced with $\alpha_n$ and $\operatorname{ht}(q)$ replaced by
$p^n$ still hold, but the implicit constants depend on the sequence $\{j_i\}$.
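
The recursion in this remark has a simple closed form, which the following sketch checks in exact arithmetic. It assumes the indexing in which the sequence starts at the given quadratic irrational and each step adds the next digit j_n and divides by p, and it represents a + b sqrt(d) by the pair (a, b) of rationals; both the indexing and the function names are assumptions of the sketch.

```python
from fractions import Fraction


def iterate(alpha, js, p):
    """Apply x -> (x + j) / p successively; alpha = (a, b) stands for a + b*sqrt(d)."""
    a, b = Fraction(alpha[0]), Fraction(alpha[1])
    out = [(a, b)]
    for j in js:
        a, b = (a + j) / p, b / p
        out.append((a, b))
    return out


def closed_form(alpha, js, p, n):
    """p^(-n) * (alpha + sum_{i<n} j_i p^i), again as a pair (a, b)."""
    a, b = Fraction(alpha[0]), Fraction(alpha[1])
    s = sum(j * p ** i for i, j in enumerate(js[:n]))
    return ((a + s) / p ** n, b / p ** n)


alpha, js, p = (0, 1), [1, 0, 2, 1], 3      # alpha = sqrt(d), digits j_i in {0, 1, 2}
steps = iterate(alpha, js, p)
print(all(steps[n] == closed_form(alpha, js, p, n) for n in range(len(js) + 1)))
```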
The following amusing phenomenon comes out of our analysis. Recall that the group
$\mathrm{GL}_2(\mathbb{Z})$ acts on the real line by Möbius transformations. It is well known that for $\alpha \in \mathbb{R}$,
the orbit $\{\gamma\cdot\alpha : \gamma \in \mathrm{GL}_2(\mathbb{Z})\}$ is characterized as the set of all real numbers whose c.f.e
has the same tail as the c.f.e of $\alpha$.

Theorem 2.7. Let $\alpha, S$ be as in Theorem 2.1. The implicit constants appearing in all
the results of §2.3 (namely in (2.2), (2.3), and (2.4)) may be taken to be uniform for all
the quadratic irrationals in $\{\gamma\cdot\alpha : \gamma \in \mathrm{GL}_2(\mathbb{Z})\}$ if and only if all the primes in $S$ do not
split in the quadratic extension $\mathbb{Q}(\alpha)$.

2.4. Results regarding periodic geodesics. The tight connection between the theory
of continued fractions and the geodesic flow on the unit tangent bundle to the modular
surface is by now considered classical and dates back to E. Artin's 1924 paper [Art82]. The
proofs of the above theorems utilize this connection and follow from a corresponding
equidistribution theorem of certain geodesic loops in this space. Nonetheless, quite a
bit of technical work is needed (see §§8, 9) to derive the above effective results regarding
continued fractions from Theorem 2.8 below.

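For any quadratic irrational $\alpha$, let $\alpha'$ denote its Galois conjugate and let us define

\[ g_\alpha = \begin{pmatrix} \alpha & \alpha' \\ 1 & 1 \end{pmatrix} \text{ if } \alpha - \alpha' > 0, \text{ and } g_\alpha = \begin{pmatrix} \alpha' & \alpha \\ 1 & 1 \end{pmatrix} \text{ otherwise.} \tag{2.5} \]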

The homogeneous space⁴

\[ X_\infty = \mathrm{PGL}_2(\mathbb{Z}) \backslash \mathrm{PGL}_2(\mathbb{R}) \]

is naturally identified with the unit tangent bundle to the modular surface⁵ (the subscript
$\infty$ will become clear in §3). Via this identification the geodesic flow corresponds to the
action from the right of the diagonal group $A_\infty = \{a_\infty(t) : t \in \mathbb{R}\}$, where $a_\infty(t) = \operatorname{diag}(e^t, 1) \in
\mathrm{PGL}_2(\mathbb{R})$. By Lemma S3 below, the point $x_\alpha = \mathrm{PGL}_2(\mathbb{Z})g_\alpha \in X_\infty$ is a periodic point for
the geodesic flow; let $\mu_\alpha$ denote the normalized length measure supported on the geodesic
loop through $x_\alpha$. Finally, let $m_\infty$ denote the $\mathrm{PGL}_2(\mathbb{R})$-invariant probability measure on
$X_\infty$. The following theorem establishes, for example, the equidistribution $\mu_{p^n\alpha} \to m_\infty$.

Theorem 2.8. Let $\alpha$ be a quadratic irrational and $S$ a finite set of primes, $q$ a rational
number supported on $S$, and $\epsilon > 0$. For any $\kappa$-Lipschitz function $\varphi \in L^2(X_\infty, m_\infty)$ one
has

\[ \left| \int_{X_\infty} \varphi\,d\mu_{q\alpha} - \int_{X_\infty} \varphi\,dm_\infty \right| \ll_{\alpha,S,\epsilon} \max\{\|\varphi\|_2, \kappa\} \operatorname{ht}(q)^{-\frac{\delta_0}{2}+\epsilon}. \tag{2.6} \]

In fact, as explained in Remark 4.10, Theorem 2.8 follows from the more general Theorem 4.9.
In a nutshell, Theorem 4.9 asserts that if one considers the $p$-Hecke tree through
$x_\alpha$, then unless an obvious obstacle is present, when taking a sequence $x_n$ on the tree which
drifts away from the root $x_\alpha$, the periodic geodesics through $x_n$ must equidistribute to
$m_\infty$. Moreover, the amusing phenomenon referred to in Theorem 2.7 asserts that this



4 This does not reflect the authors' preference saying that the right action is the left one.
5 See m

equidistribution is uniform in the distance from $x_n$ to the root if and only if the prime $p$
does not split in the corresponding quadratic extension of $\mathbb{Q}$. For a richer family of measures
than $\mu_{q\alpha}$ for which an estimate such as (2.6) holds (in the spirit of Remark 2.6) we refer
the reader to Remark 4.14.

2.5. About the paper. The first version of this paper was much shorter and more elementary,
but it only established the equidistribution $\mu_{p^n\alpha} \to m_\infty$ (and its implication $\nu_{p^n\alpha} \to \nu$) and
lacked the effective nature that appears in the results presented here. The decision to write
the current version has a few serious disadvantages. The first is that the underlying simple
argument is somewhat hidden behind many technical preparations which are needed for
the effective results (thus making the paper longer), and the second is that it narrows
down the readership, as sophisticated tools such as effective decay of matrix coefficients
are used. In order to remedy this, we plan on writing a survey paper [AS], in which only
the 'soft' convergence $\mu_{p^n\alpha} \to m_\infty$ is proved. We expect this survey to be essentially
self-contained and approachable for any advanced graduate student. It should also serve the
purpose of preparing readers with less background for reading the current paper.

2.6. References to existing results. Although the question of the evolution of the
c.f.e along arithmetically defined sequences in a fixed quadratic field is extremely natural,
we did not find too many relevant papers to cite. Some earlier works studying
the statistics of the period 'on average' (and also not in a fixed field) were initiated by
Arnold (see [Arn08], [Arn07], [Ler10] and the references therein). See also [Pol86]. Other
works, mostly related to the length of the period, which the reader might find related,
may be found for example in many of the papers of Golubeva (such as [Gol02]) and
in [Gri98], [BL05], [MF93], [CZ04], [Coh77], [Hic73], [Kei]. Standing out in this context is the
recent paper of McMullen [McM09], which provides examples of sequences of quadratic
irrationals in a fixed quadratic field with uniformly bounded c.f.e digits. We suspect that it
should be very interesting to compare in detail how McMullen's results fit together with
the results of the present paper.

As for results regarding periodic geodesics the situation is completely different and we
will not try to list below all the relevant earlier work that has been done in the subject. We
do wish to comment though, that as will be explained in §5, Theorems 2.8 and 4.9 are closely
related to the works of Benoist and Oh [BO07] and to Duke's Theorem [Duk88], [ELMV].
In fact, the non-effective equidistribution $\mu_{q_n\alpha} \to m_\infty$ (where the $q_n$ are supported on $S$
and $\operatorname{ht}(q_n) \to \infty$) follows from [BO07, Theorem 1.1] (and from Duke's Theorem) by a short
elementary argument as will be explained in §5.1. Both the works of Benoist and Oh, and
Duke's Theorem deal with the equidistribution of certain collections of closed geodesics
and the phenomenon that happens in our case is that the single geodesic corresponding
to $q\alpha$ occupies a positive proportion of the collection (see Lemma IBTTj).

Although the non-effective equidistribution $\mu_{q_n\alpha} \to m_\infty$ can be deduced from known
results as stated above, our argument is independent of them and moreover, as noted
above, it may be adapted to give an essentially self-contained proof of the non-effective
result [AS].

2.7. Some open problems. We list below a few questions which emerge from our discussion
and remain unsolved. Each of the problems below has a corresponding problem
stated in terms of periodic geodesics on the modular surface.

(1) Give satisfactory sufficient conditions on a sequence of rationals $q_n$ to ensure that
for a quadratic irrational $\alpha$, the sequence of measures $\nu_{q_n\alpha}$ equidistributes to the
Gauss-Kuzmin measure $\nu$. It might be interesting to replace the quantifiers and
allow the conditions to depend on $\alpha$.

(2) Is it true that for a quadratic irrational $\alpha$ which is not a unit in the ring of integers
of $\mathbb{Q}(\alpha)$, the sequence of measures $\nu_{\alpha^n}$ always equidistributes to the Gauss-Kuzmin
measure along the subsequence of $n$'s for which $\alpha^n$ is irrational (see [CZ04])? Note
that our results deal with the case $\alpha = \sqrt{d}$.

(3) Let $p_n$ be an enumeration of the primes. Are there any quadratic irrationals $\alpha$ for
which $\nu_{p_n\alpha}$ equidistributes to the Gauss-Kuzmin measure?

(4) Is it true that for any quadratic irrational $\alpha$ there exists a sequence of distinct
primes $p_n$ so that $\nu_{p_n\alpha}$ equidistributes to the Gauss-Kuzmin measure?
3. Preliminaries

3.1. Notation. For a prime $p$ we let $\mathbb{Q}_p$ denote the field of $p$-adic numbers and
$\mathbb{Z}_p$ the ring of $p$-adic integers. We sometimes denote $\mathbb{Q}_\infty = \mathbb{R}$. The set $P = \{\infty\} \cup
\{p \in \mathbb{N} : p \text{ is a prime}\}$ will be referred to as the set of places of $\mathbb{Q}$, the primes being the
finite places.

Let $S \subset P$ be given. Throughout we let $S_f = S \setminus \{\infty\}$. We denote by $\mathbb{Q}_S, \mathbb{Z}_S$ the
product rings $\prod_{v\in S}\mathbb{Q}_v$, $\prod_{v\in S}\mathbb{Z}_v$ respectively (the latter makes sense only when $\infty \notin S$).
Let $G$ denote either one of the algebraic groups $\mathrm{PGL}_2$, $\mathrm{PSL}_2$. We denote $G_S = G(\mathbb{Q}_S)$.
We denote an element $g \in G_S$ by a sequence $g = (g_v)_{v\in S}$ where $g_v$ is a $2 \times 2$ matrix over $\mathbb{Q}_v$
(note the slight abuse of notation). If $\infty \in S$ we usually abbreviate and write $g = (g_\infty, g_f)$
where $g_f$ denotes the tuple of the components corresponding to the finite places in $S$. The
identity elements in the various groups are denoted by $e$ with the corresponding subscript.
Thus for example $e_S = (e_\infty, e_f)$.

We may view the group $\Gamma_S = G(\mathbb{Z}[\frac{1}{p} : p \in S_f])$ as a subgroup of $G_S$ (embedded
diagonally). If $\infty \in S$, it is well known that $\Gamma_S$ is a lattice in $G_S$. We denote by $X_S$
the homogeneous space $\Gamma_S \backslash G_S$ and by $m_S$ the $G_S$-invariant probability measure on it.
The real quotient $X_\infty = \Gamma_\infty \backslash G_\infty$ is a factor of $X_S$ in a natural way: Let $K_\infty$ denote the
maximal compact subgroup of $G_\infty$ which is the (projective) orthogonal group $\mathrm{PO}_2(\mathbb{R})$
in the case of $\mathrm{PGL}_2$ or $\mathrm{PSO}_2$ in the case of $\mathrm{PSL}_2$. For a finite place $p \in P_f$ we let
$K_p = G(\mathbb{Z}_p)$. We then let $K_S$ denote the product $\prod_{v\in S} K_v$. If $\infty \in S$, the double coset
space $X_S / K_{S_f} = \Gamma_S \backslash G_S / K_{S_f}$ is naturally identified with $X_\infty$. We denote by $\pi : X_S \to X_\infty$
the natural projection.

Remark 3.1. In practice, given $x = \Gamma_S(g_\infty, g_f) \in X_S$ with representative $(g_\infty, g_f)$ such
that $g_f \in K_{S_f}$, the projection $\pi(x)$ is defined to be $\Gamma_\infty g_\infty$. In other words, $\pi^{-1}(\Gamma_\infty g_\infty) =
\{\Gamma_S(g_\infty, g_f) : g_f \in K_{S_f}\}$. Another useful observation to keep in mind here is that two
points $x_1 = \Gamma_S(g_\infty, g_f)$, $x_2 = \Gamma_S(g_\infty, h_f)$ are in the same fiber (that is, $\pi(x_1) = \pi(x_2)$) if
and only if the quotient $g_f^{-1}h_f$ belongs to $K_{S_f}$.

The group $G_S$ (and all its subgroups) act on $X_S$ by right translation. In particular, if
$T \subset S$, we may view $G_T$ (and its subgroups) as a subgroup of $G_S$ and thus it acts on $X_S$.
Note that $\pi : X_S \to X_\infty$ intertwines the $G_\infty$ actions. Of particular interest to us will be
the action of the real diagonal group $A_\infty = \{\operatorname{diag}(e^t, 1) : t \in \mathbb{R}\}$, the elements of which
we often write as $a_\infty(t) = \operatorname{diag}(e^t, 1)$.

We say that an orbit $xL$ of a closed subgroup $L < G_S$ through a point $x \in X_S$ is
periodic if it supports an $L$-invariant probability measure. Such a measure is unique and
we refer to it as the Haar measure on the periodic orbit. Compact orbits are always
periodic. Given a measure $\mu$ on $X_S$ and $g \in G_S$ we let $g_*\mu$ denote the pushed forward
measure by right translation by $g$. This notation is a bit awkward as $(gh)_*\mu = h_*(g_*\mu)$.
This will not bother us as we will only use commutative subgroups to push measures.

The Lie algebra of $G_v$ will be denoted by $\mathfrak{g}_v$ and is naturally identified with the space
of traceless $2 \times 2$ matrices over $\mathbb{Q}_v$. Similarly to the notation introduced above we will
denote by $\mathfrak{g}_S = \oplus_{v\in S}\mathfrak{g}_v$ the Lie algebra of $G_S$. A basic fact that we will use is that if
$S$ is finite and $L < G_S$ is a closed subgroup then $L$ contains an open product subgroup
$\prod_{v\in S} L_v$, which allows us to speak of the Lie algebra of $L$, which will be denoted $\operatorname{Lie}(L)$.
The exponential map $\exp_v : \mathfrak{g}_v \to G_v$ is defined for any place $v$ by the usual power series
and in fact, is only well defined for finite places on a certain neighborhood of 0. We
denote its inverse by $\log_v$ (it is defined on a small enough neighborhood of $e_v$) and use
the obvious notation $\exp_S$, $\log_S$ to denote the corresponding maps from the corresponding
domains in $\mathfrak{g}_S$, $G_S$ respectively.

Given an element $g \in G_S$ and an element $u$ (either of $G_S$ or of $\mathfrak{g}_S$), we denote by $u^g$ the
conjugation $g^{-1}ug$. If $g$ is semisimple we denote by $(\mathfrak{g}_S)^{ws}_g$ the weak stable subalgebra of
$\mathfrak{g}_S$. It is defined as the direct sum of the eigenspaces (of the operator $u \mapsto u^g$) with
eigenvalues of modulus $\le 1$, or equivalently

\[ (\mathfrak{g}_S)^{ws}_g = \{u \in \mathfrak{g}_S : \{u^{(g^n)}\}_{n\ge 0} \text{ is bounded in } \mathfrak{g}_S\}. \]

For each place $v$ we equip $G_v$, $\mathfrak{g}_v$ with metrics in the following way: For $v = \infty$ we start
with an inner product on $\mathfrak{g}_\infty$ which is right $K_\infty$-invariant and use left translation to make
it into a left invariant Riemannian metric on $G_\infty$ which is also right $K_\infty$-invariant. This
Riemannian metric induces a left $G_\infty$-invariant, bi-$K_\infty$-invariant metric on $G_\infty$. For a finite
place $v$, we start with a bi-$K_v$-invariant metric $d_{K_v}$ on $K_v$ (such that $K_v$ equals the closed
unit ball around $e_v$) and make it into a left invariant metric on $G_v$ (which is also right
$K_v$-invariant) by setting $d_{G_v}(g_1,g_2) = 2$ if $g_1^{-1}g_2 \notin K_v$ and $d_{G_v}(g_1,g_2) = d_{K_v}(g_1^{-1}g_2, e_v)$
otherwise. On the Lie algebra $\mathfrak{g}_v$ we take the metric given by
$d_{\mathfrak{g}_v}(u,w) = \max\{|u_{ij} - w_{ij}|_v : 1 \le i,j \le 2\}$,
where the indices $i,j$ stand for the entries of the corresponding matrix. We usually denote
the distance from $0 \in \mathfrak{g}_v$ by $d_{\mathfrak{g}_v}(u,0) = \|u\|$ and refer to it as the norm of $u$. We define
the metrics $d_{G_S}$, $d_{\mathfrak{g}_S}$ on $G_S$, $\mathfrak{g}_S$ respectively by taking the maximum of the metrics defined
above over the places in $S$. The metric $d_{G_S}$ induces a right-$K_S$-invariant metric on $X_S$
by setting $d_{X_S}(\Gamma_S g_1, \Gamma_S g_2) = \inf_{\gamma\in\Gamma_S} d_{G_S}(\gamma g_1, g_2)$.

In a metric space $(X, d_X)$ we denote by $B_r^X(x)$ the open ball of radius $r$ around $x$. In case
the space is a group, we denote by $B_r^X$ the corresponding ball around the trivial element.

We finish this section by stating two basic facts in the form of lemmas for convenience 
of reference. 

Lemma 3.2. Let $xH \subset X_S$ be a periodic orbit of a closed subgroup $H < G_S$ with Haar
measure $\eta$. Then for any $g \in G_S$ the translate $xHg = xgH^g$ is a periodic orbit for $H^g$ with
Haar measure $g_*\eta$.

Lemma 3.3. Let $h(t)$ be a one parameter subgroup of $G_\infty$. Then for any $g \in G_\infty$,
$d_{G_\infty}(g, gh(t)) \le \|\dot h(0)\|\,|t|$, where $\|\dot h(0)\|$ is the norm of the derivative of $h(t)$ at the identity.



4. The S-Hecke graph and the main theorem 

Throughout this section we use the notation introduced in §3 with the choice $G = \mathrm{PGL}_2$.
We fix a finite set of places $S$ containing $\infty$. The space $X_\infty$ can be thought of as the moduli
space of equivalence classes of 2-dimensional lattices in the plane $\mathbb{R}^2$ up to homothety.
We will refer below to a point $x \in X_\infty$ as a class; here, the class $\Gamma_\infty g$ is composed of the
lattice spanned by the rows of the matrix $g$ (which is well defined up to scaling) and all
its homotheties.

We begin by describing a setting which will put Theorem 2.8 in a context which will
allow us to state a certain generalization of it. In a few words, we will fix a class $x$ with
periodic $A_\infty$-orbit and consider a class $x'$ on the $S$-Hecke graph through $x$ and prove an
effective equidistribution statement regarding the periodic orbit $x'A_\infty$ as $x'$ drifts away
from $x$ in the graph.

4.1. Hecke friends. Given a class $x \in X_\infty$, we say that a class $x'$ is a Hecke friend of $x$
if one can choose lattices $\Lambda_x \in x$, $\Lambda_{x'} \in x'$ such that $\Lambda_{x'} < \Lambda_x$. After fixing the lattice $\Lambda_x$
there is a unique choice of $\Lambda_{x'} \in x'$ such that $\Lambda_{x'} < \Lambda_x$ is primitive; that is, such that the
index $[\Lambda_x : \Lambda_{x'}]$ is minimal. We denote this minimal index by $\operatorname{ind}(x,x')$. We say that $x'$
is an $S_f$-Hecke friend of $x$ if the index $\operatorname{ind}(x,x')$ is supported on $S_f$. It is elementary to
check that the Hecke friendship relation is an equivalence relation and that furthermore,
if $x, x'$ are Hecke friends then $\operatorname{ind}(x,x') = \operatorname{ind}(x',x)$.

4.2. The graph. For a class $x \in X_\infty$ we define

\[ \mathcal{G}_S(x) = \{x' \in X_\infty : x, x' \text{ are } S_f\text{-Hecke friends}\}. \tag{4.1} \]

The set $\mathcal{G}_S(x)$ has the structure of a graph.⁶ We join $x_1, x_2 \in \mathcal{G}_S(x)$ with an edge if there
exist $\Lambda_i \in x_i$ such that $\Lambda_1$ is a sublattice of $\Lambda_2$ of index $p$ for some $p \in S_f$ (note that
as $p$ is prime this forces $\Lambda_1$ to be a primitive sublattice of $\Lambda_2$). In this case we declare
the length of this edge to be $\log(p)$. This induces a distance function on the graph which
we denote $d_{\mathcal{G}}(\cdot,\cdot)$, for which $d_{\mathcal{G}}(x_1,x_2) = \log(\operatorname{ind}(x_1,x_2))$. We will refer to $x$ as the root
of $\mathcal{G}_S(x)$ and call $\mathcal{G}_S(x)$ the $S$-Hecke graph through $x$. Note that the possible values of
$\operatorname{ind}(x,x')$ are exactly the heights $\operatorname{ht}(q)$ where $q$ varies along the rationals supported on $S_f$,
or in other words, the integers supported on $S_f$. We often abuse the language and refer
to these heights as admissible radii and denote by $S_h(x) = \{x' \in \mathcal{G}_S(x) : \operatorname{ind}(x,x') = h\}$
the sphere of radius $h$ around the root $x$.
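
To make the sphere concrete in the simplest model, one may enumerate the primitive sublattices of index h of the standard lattice Z^2. The sketch below lists them through row Hermite normal forms [[a, b], [0, d]] with ad = h, 0 <= b < d and gcd(a, b, d) = 1, and compares their number with the classical count h times the product of (1 + 1/p) over the primes p dividing h. The function names are hypothetical, and for a particular class x several of these sublattices may represent the same class, so the count should be read only as an upper bound on the size of the sphere.

```python
from math import gcd


def primitive_sublattices(h):
    """Row-HNF bases ((a, b), (0, d)) of the primitive sublattices of Z^2 of index h."""
    out = []
    for a in range(1, h + 1):
        if h % a:
            continue
        d = h // a
        for b in range(d):
            # Primitive means the elementary divisors are (h, 1), i.e. the
            # entries of a basis matrix have no common factor.
            if gcd(gcd(a, b), d) == 1:
                out.append(((a, b), (0, d)))
    return out


def dedekind_psi(h):
    """h * prod over primes p | h of (1 + 1/p), computed in integers."""
    result, n, p = h, h, 2
    while p * p <= n:
        if n % p == 0:
            result = result // p * (p + 1)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        result = result // n * (n + 1)
    return result


for h in [2, 4, 6, 9, 12]:
    print(h, len(primitive_sublattices(h)), dedekind_psi(h))
```

For h = p prime this recovers the p + 1 neighbours of a vertex in the p-Hecke tree.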

4.3. The sphere. For a rational $q$ supported on $S_f$ let us define

\[ a_f(q) = \left(e_\infty, \begin{pmatrix} 1 & 0 \\ 0 & q \end{pmatrix}, \dots, \begin{pmatrix} 1 & 0 \\ 0 & q \end{pmatrix}\right). \tag{4.2} \]

Given $x \in X_\infty$ we wish to have a convenient algebraic description of the classes on
the sphere $S_h(x)$ for admissible radii $h$. We obtain this description using the extension
$\pi : X_S \to X_\infty$ in the following way: Lemma 4.1 below shows that the various points on
$S_h(x)$ are obtained by choosing a lift $y \in \pi^{-1}(x)$ of $x$, and projecting $ya_f(h)$ via $\pi$ back
to $X_\infty$.

Lemma 4.1. For $x = \Gamma_\infty g \in X_\infty$ and $h$ an admissible radius we have

\[ S_h(x) = \pi\left(\{\Gamma_S(g,\gamma) : \gamma \in \Gamma_\infty\}\, a_f(h)\right) = \pi\left(\pi^{-1}(x)\, a_f(h)\right). \tag{4.3} \]
Proof. Recall that the elementary divisors theorem attaches to any pair of lattices $\Lambda_1 <
\Lambda_2$ in the plane a pair of integers $d_1, d_2$ which are characterized by the following two
properties: (1) the divisibility $d_2 \mid d_1$ holds, (2) there exists a basis $v_1, v_2$ of $\Lambda_2$ such that
$d_1v_1, d_2v_2$ forms a basis of $\Lambda_1$. Note that $\Lambda_1$ is a primitive sublattice of $\Lambda_2$ if and only if
the second divisor satisfies $d_2 = 1$. We conclude from here that given a class $x = \Gamma_\infty g$,
a class $x'$ lies on the sphere $S_h(x)$ if and only if there exists a lattice $\Lambda_{x'} \in x'$ which
is a sublattice of the lattice $\Lambda_x \in x$ spanned by the rows of $g$ such that the elementary
divisors are $d_1 = h$, $d_2 = 1$. In other words we have the equality

\[ S_h(x) = \{\Gamma_\infty \operatorname{diag}(h,1)\gamma g : \gamma \in \Gamma_\infty\}. \tag{4.4} \]

6 When $S_f$ contains only one prime, this is the well known $p$-Hecke tree through $x$. In general, this
graph is the product of the various $p$-Hecke trees for $p \in S_f$.

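The following identity is crucial for us. It shows how the lattice $\Gamma_S$ causes the desired
interaction between the real and $p$-adic components in the extension $X_S$ of $X_\infty$:

$$\Gamma_\infty \operatorname{diag}(\mathbf{h},1)\,\gamma g = \pi\big(\Gamma_S(\operatorname{diag}(\mathbf{h},1)\gamma g,\, e_f)\big) = \pi\big(\Gamma_S\,\underbrace{\gamma^{-1}\operatorname{diag}(1,\mathbf{h})}_{\in\,\Gamma_S}\,(\operatorname{diag}(\mathbf{h},1)\gamma g,\, e_f)\big) = \pi\big(\Gamma_S(g,\gamma^{-1})\,a_f(\mathbf{h})\big). \tag{4.5}$$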

From equations (4.4), (4.5) we immediately conclude that

$$\mathcal{S}_{\mathbf{h}}(x) = \pi\left(\{\Gamma_S(g,\gamma) : \gamma \in \Gamma_\infty\}\, a_f(\mathbf{h})\right),$$

which is the first equality in (4.3). Using the first equality, the second equality follows
once we show that for any given $\omega \in K_{S_f}$ there exists $\gamma \in \Gamma_\infty$ such that
$\pi(\Gamma_S(g,\gamma)a_f(\mathbf{h})) = \pi(\Gamma_S(g,\omega)a_f(\mathbf{h}))$. A short calculation using Remark 3.1 shows that this happens precisely
when

$$\gamma^{-1}\omega \in a_f(\mathbf{h})\,K_{S_f}\,a_f(\mathbf{h})^{-1}. \tag{4.6}$$

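Thus, let $\omega = (\omega_p)_{p\in S_f} \in K_{S_f}$ be given and write $\omega_p = \theta_p \cdot \operatorname{diag}(1,\det(\omega_p))$, with $\theta_p \in \mathrm{SL}_2(\mathbb{Z}_p)$.
Let $U_p^n < \mathrm{SL}_2(\mathbb{Z}_p)$ be the subgroup consisting of elements congruent to the
identity modulo $p^n$. By the strong approximation theorem for $\mathrm{SL}_2$ (see [PR94, §7.4]), for
any $n \in \mathbb{N}$ there exists $\gamma_n \in \mathrm{SL}_2(\mathbb{Z})$ such that for all $p \in S_f$

$$\gamma_n^{-1}\theta_p \in U_p^n.$$

Note that there exists $N = N(\mathbf{h}) \in \mathbb{N}$ such that for all $n > N$ the image of
$\prod_{p\in S_f} U_p^n$ in $G_{S_f}$ lies in $a_f(\mathbf{h})K_{S_f}a_f(\mathbf{h})^{-1}$. As $a_f(\mathbf{h})K_{S_f}a_f(\mathbf{h})^{-1}$ is a group that contains
$(\operatorname{diag}(1,\det(\omega_p)))_{p\in S_f}$, we conclude that $\prod_{p\in S_f} U_p^n \cdot \operatorname{diag}(1,\det(\omega_p)) \subset a_f(\mathbf{h})K_{S_f}a_f(\mathbf{h})^{-1}$.
Therefore any $\gamma_n$ with $n > N$ will satisfy equation (4.6). This concludes the proof of the
lemma.

□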

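Definition 4.2. Let $x \in X_\infty$ be given. Let $g_x \in G_\infty$ be a choice of a representative for $x$
so that $x = \Gamma_\infty g_x$. For any choice $\omega \in K_{S_f}$ we define the generalized branch $\mathcal{L}_{g_x,\omega} \subset \mathcal{G}_S(x)$
to be the set

$$\mathcal{L}_{g_x,\omega} = \pi\left(\{\Gamma_S(g_x,\omega)\,a_f(\mathbf{h}) : \mathbf{h} \text{ is an admissible radius}\}\right). \tag{4.7}$$

When $\omega$ is a rational element (i.e. for any $p \in S_f$ the $p$'th component $\omega_p$ of $\omega$ satisfies
$\omega_p \in K_p \cap \mathrm{PGL}_2(\mathbb{Q})$) we call the generalized branch $\mathcal{L}_{g_x,\omega}$ a rational generalized branch.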

The reader should think of the generalized branches as prescribed ways to go to infinity
in the graph $\mathcal{G}_S(x)$. When $S_f$ is composed of a single prime the generalized branches are
exactly the branches on the Hecke tree that start from the root $x$.

Remark 4.3. We wish to point out a few things regarding the definition of generalized
branches and fix some notation that will be used in the sequel. Let $x = \Gamma_\infty g_x \in X_\infty$ be
given.

(1) For any $\omega \in K_{S_f}$ and any admissible radius $\mathbf{h}$ we denote $y_{\omega,\mathbf{h}} = \Gamma_S(g_x,\omega)\,a_f(\mathbf{h}) \in X_S$
and $x_{\omega,\mathbf{h}} = \pi(y_{\omega,\mathbf{h}}) \in X_\infty$. The generalized branch $\mathcal{L}_{g_x,\omega}$ intersects the sphere
$\mathcal{S}_{\mathbf{h}}(x)$ in a single point, namely

$$x_{\omega,\mathbf{h}} = \mathcal{L}_{g_x,\omega} \cap \mathcal{S}_{\mathbf{h}}(x). \tag{4.8}$$

When the generalized branch is fixed (that is, when $\omega$ is fixed) we sometimes denote
$x_{\mathbf{h}} = x_{\omega,\mathbf{h}}$. We stress here the dependency on the representative $g_x$ of $x$. Note that
we do not recall this dependency in the notation $x_{\omega,\mathbf{h}}$, $y_{\omega,\mathbf{h}}$.

(2) Two generalized branches $\mathcal{L}_{g_x,\omega_1}$, $\mathcal{L}_{g_x,\omega_2}$ intersect the sphere $\mathcal{S}_{\mathbf{h}}(x)$ at the same
point, that is, $x_{\omega_1,\mathbf{h}} = x_{\omega_2,\mathbf{h}}$, if and only if the points $y_{\omega_i,\mathbf{h}}$ lie in the same fiber of
$\pi$. This is in turn equivalent to saying that the conjugation $(\omega_2^{-1}\omega_1)^{a_f(\mathbf{h})}$ lies in
$K_{S_f}$ (see Remark 3.1). This happens if and only if the lower left coordinate of
each of the components of $\omega_2^{-1}\omega_1$ is divisible by $\mathbf{h}$ in the corresponding ring $\mathbb{Z}_p$.
In particular, it follows that it is divisible by any integer that divides $\mathbf{h}$, which
means, by the same reasoning, that the two branches intersect all the spheres
$\mathcal{S}_{\mathbf{h}'}(x)$
at the same points, for any choice of admissible radius $\mathbf{h}'$ dividing $\mathbf{h}$. Moreover, it
follows from here that given $\omega_1,\omega_2 \in K_{S_f}$, the two generalized branches $\mathcal{L}_{g_x,\omega_i}$ are
identical if and only if the quotient $\omega_2^{-1}\omega_1$ is an upper triangular element of $K_{S_f}$.

(3) From the above it follows that the collection of generalized branches may be
identified with the quotient $K_{S_f}/B$, where $B < K_{S_f}$ denotes the group of upper
triangular elements (this identification depends of course on the choice of the
representative $g_x$).

(4) If we replace $g_x$ by another representative $\gamma g_x$ for $\gamma \in \Gamma_\infty$, then it readily follows
that for any $\omega \in K_{S_f}$, $\mathcal{L}_{\gamma g_x,\omega} = \mathcal{L}_{g_x,\gamma^{-1}\omega}$. In particular, the notion of rationality of
a generalized branch is well defined.

(5) Let $q$ be a rational supported on $S_f$ and write $q = \ell_1/\ell_2$ in reduced form (that
is, $\ell_1,\ell_2$ are coprime). We let the type of $q$ be the subset $\tau_q \subset S_f$ defined by
$\tau_q = \{p \in S_f : p \mid \ell_2\}$. Thus, the rationals supported on $S_f$ are partitioned into
$2^{|S_f|}$ sets according to their types. We leave it as an exercise to the reader to show
that for any class $x = \Gamma_\infty g \in X_\infty$ and any $\omega \in K_{S_f}$, the collection

$$\mathcal{L}^{\tau}_{g,\omega} = \pi\left(\{\Gamma_S(g,\omega)\,a_f(q) : q \text{ is a rational supported on } S_f \text{ of type } \tau\}\right)$$

is a single generalized branch and that the class $\pi(\Gamma_S(g,\omega)a_f(q))$ lies on the sphere
$\mathcal{S}_{\operatorname{ht}(q)}(x)$. Moreover, the notion of rationality of the generalized branch is independent
of the type; that is, $\mathcal{L}^{\tau}_{g,\omega}$ is rational if and only if $\omega$ may be chosen rational.

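4.4. Periodic $A_\infty$-orbits. The following classical lemma relates the periodic $A_\infty$-orbits
to quadratic irrationals.

Lemma 4.4. Let $\alpha$ be a quadratic irrational, $g_\alpha \in G_\infty$ be as in (2.5), and $x_\alpha = \Gamma_\infty g_\alpha \in X_\infty$.
Then the orbit $x_\alpha A_\infty \subset X_\infty$ is periodic.

Proof. Consider the $\mathbb{Z}$-module $\Lambda_\alpha = \operatorname{span}_{\mathbb{Z}}\{1,\alpha\}$ in the field $\mathbb{Q}(\alpha)$. There exists a unit
$\omega$ in the ring of integers which stabilizes $\Lambda_\alpha$, and furthermore by replacing $\omega$ by $\omega^2$ if
necessary we may assume that both $\omega$ and its Galois conjugate $\omega'$ are positive. Since
$\omega\Lambda_\alpha = \Lambda_\alpha$, we may let $\gamma = \begin{pmatrix} n & m \\ k & \ell \end{pmatrix} \in \mathrm{GL}_2(\mathbb{Z})$ be
the matrix describing the passage from the basis $\{1,\alpha\}$ to the basis $\{\omega,\omega\alpha\}$ of $\Lambda_\alpha$. That
is,

$$\begin{pmatrix} n & m \\ k & \ell \end{pmatrix}\begin{pmatrix} \alpha \\ 1 \end{pmatrix} = \begin{pmatrix} \omega\alpha \\ \omega \end{pmatrix}. \tag{4.9}$$

The reader will easily verify now that (4.9) implies that $\gamma g_\alpha = g_\alpha \operatorname{diag}(\omega,\omega')$, or in other
words that in the space $X_\infty$ the orbit $x_\alpha A_\infty$ is periodic as desired. □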

Remark 4.5. In fact, it is well known (cf. [LW01], [McM05], [ELMV09]) that any periodic
$A_\infty$-orbit in $X_\infty$ is of the above form. Given a periodic orbit $xA_\infty$, we denote by $\mathbb{F}_x$ the
corresponding quadratic extension of $\mathbb{Q}$ from which the periodic orbit $xA_\infty$ arises (see
also Remark 7.4 below).

Definition 4.6. Given $x \in X_\infty$ with a periodic $A_\infty$-orbit we denote by $t_x$ the length of
the period, i.e. the minimal positive $t$ for which $x\,a_\infty(t) = x$. If $x = x_\alpha$ (in the notation
introduced after (2.5)), we denote this period by $t_\alpha$. We let $\mu_x$ denote the unique
$A_\infty$-invariant probability measure supported on $xA_\infty$ and similarly, when $x = x_\alpha$ we denote
this measure by $\mu_\alpha$.

Let $x \in X_\infty$ be a class with a periodic $A_\infty$-orbit. It is straightforward to argue that
any $x' \in \mathcal{G}_S(x)$ has a periodic orbit as well. We are interested in understanding the way
the orbit $x'A_\infty$ is distributed in $X_\infty$ as $d_{\mathcal{G}}(x,x')$ goes to $\infty$.

Remark 4.7. It turns out that the answer to this question has to do with the question
of whether or not the primes $p \in S_f$ split in the field $\mathbb{F}_x$ (see Remark 4.5). Let $\gamma \in \Gamma_\infty$
be a matrix such that the roots of its characteristic polynomial generate $\mathbb{F}_x$. Recall that
a prime $p$ splits in $\mathbb{F}_x$ if and only if $\gamma$ is diagonalizable over $\mathbb{Q}_p$. A short exercise in linear
algebra shows that $\gamma \in \Gamma_\infty$ is diagonalizable over $\mathbb{Q}_p$ if and only if it can be triangulized
over $\mathbb{Z}_p$.
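
The splitting criterion is easy to test numerically in the simplest case. The following sketch is ours, not the paper's: it assumes $p$ is odd and does not divide the discriminant $D = \operatorname{tr}(\gamma)^2 - 4\det(\gamma)$ of the characteristic polynomial, in which case (by Hensel's lemma and Euler's criterion) $\gamma$ has two distinct eigenvalues in $\mathbb{Q}_p$ exactly when $D$ is a nonzero square modulo $p$. The function name `is_split` is hypothetical.

```python
def is_split(gamma, p):
    """True if the characteristic polynomial of the 2x2 integer matrix gamma
    has two distinct roots in Q_p.

    Assumption of this sketch: p is an odd prime not dividing the discriminant
    D = tr^2 - 4*det, so Euler's criterion decides whether D is a square mod p.
    """
    (a, b), (c, d) = gamma
    disc = (a + d) ** 2 - 4 * (a * d - b * c)
    if p % 2 == 0 or disc % p == 0:
        raise ValueError("sketch only covers odd primes not dividing the discriminant")
    return pow(disc, (p - 1) // 2, p) == 1
```

For example, the matrix `((1, 2), (1, 1))` has eigenvalues $1 \pm \sqrt{2}$, so `is_split` reports whether $p$ splits in $\mathbb{Q}(\sqrt{2})$.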

Definition 4.8. Let $x$ be a class with a periodic $A_\infty$-orbit and fix a representative $g_x$ so
that $x = \Gamma_\infty g_x$. Let $\gamma_x \in \Gamma_\infty$ be the matrix satisfying $\gamma_x g_x = g_x a_\infty(t_x)$, and let $\omega \in K_{S_f}$.

(1) We say that the generalized branch $\mathcal{L}_{g_x,\omega}$ is degenerate (for $S_f$) if there exists
$p \in S_f$ such that $\omega_p^{-1}\gamma_x^n\omega_p$ is upper triangular for some positive integer $n$ (here $\omega_p$
is the $p$'th component of $\omega \in K_{S_f}$).$^{7}$

(2) We say that the class $x$ is split (for $S_f$) if there exists $p \in S_f$ which splits over $\mathbb{F}_x$
(or equivalently, by Remark 4.7, if there exists a degenerate generalized branch).

We are now ready to state the main result of the present paper. 



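7 This is equivalent to saying that the Lie algebra of the closure of the group generated by $\omega_p^{-1}\gamma_x\omega_p$ in $G_p$ is
upper triangular.

Theorem 4.9. Let $x = \Gamma_\infty g_x \in X_\infty$ be such that $xA_\infty$ is periodic.

(1) Let $\mathcal{L} = \mathcal{L}_{g_x,\omega}$ be a nondegenerate generalized branch of the graph $\mathcal{G}_S(x)$ and $\mathbf{h}$ an
admissible radius. Then for any $\kappa$-Lipschitz function $\varphi \in L^2(X_\infty, m_\infty)$ and any
$\epsilon > 0$ the following holds:

$$\left|\int \varphi\, d\mu_{x_{\mathbf{h}}} - \int \varphi\, dm_\infty\right| \ll_{x,S_f,\mathcal{L},\epsilon} \max\{\|\varphi\|_2, \kappa\}\, \mathbf{h}^{-\frac{\delta_0}{2}+\epsilon}. \tag{4.10}$$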



(2) If $x$ is non-split (i.e. all generalized branches are nondegenerate), the implicit
constant in (4.10) may be chosen to be independent of the generalized branch and
we have a uniform rate of equidistribution along the full graph.

(3) If $x$ is split and $\mathcal{L}_{g_x,\omega}$ is a degenerate generalized branch, then there is a sequence of
admissible radii $\mathbf{h}_n \to \infty$ such that for the sequence of classes $x_{\mathbf{h}_n}$, the lengths $t_{x_{\mathbf{h}_n}}$
of the orbits $x_{\mathbf{h}_n}A_\infty$ are bounded and in particular, the orbits do not equidistribute.

(4) Rational generalized branches are always nondegenerate and so (4.10) holds
automatically.

(5) Nonetheless, in case $x$ is split, the implicit constants in (4.10) cannot be taken to
be uniform for the rational generalized branches.

Remark 4.10. Let $\alpha$ be a quadratic irrational and $q$ a rational supported on $S_f$. Suppose
for example that $\alpha > \alpha'$ so that the matrix $g_\alpha$ has the form that appears on the left hand
side of (2.5). We wish to analyze where on $\mathcal{G}_S(x_\alpha)$ the class $x_{q\alpha}$ lies. We have the following
identity:

$$x_{q\alpha} = \pi\big(\Gamma_S(g_{q\alpha}, e_f)\big) = \pi\big(\Gamma_S\,\underbrace{a_\infty(q^{-1})\,a_f(q^{-1})}_{\in\,\Gamma_S}\,(g_\alpha, e_f)\,a_f(q)\big) = \pi\big(\Gamma_S(g_\alpha, e_f)\,a_f(q)\big), \tag{4.11}$$

which shows that $x_{q\alpha}$ lies on the rational generalized branch $\mathcal{L}^{\tau_q}_{g_\alpha,e_f}$ on the sphere $\mathcal{S}_{\operatorname{ht}(q)}(x_\alpha)$
(see Remark 4.3(5)). As only finitely many such generalized branches are involved we see
that parts (1), (4) of Theorem 4.9 indeed imply Theorem 2.8.



4.5. Walking on the tree. Let us consider for simplicity the case when $S_f$ consists of
a single prime $p$ and let $x = \Gamma_\infty g_x$ be a class in $X_\infty$ with a fixed representative $g_x$. As
noted earlier, the graph $\mathcal{G}_p(x)$ is a $(p+1)$-regular tree. Let us denote

$$S_p = \begin{pmatrix} p & 0 \\ 0 & 1 \end{pmatrix}, \qquad T_{p,j} = \begin{pmatrix} 1 & j \\ 0 & p \end{pmatrix}, \quad j = 0,1,\dots,p-1. \tag{4.12}$$

Furthermore, we set $H = \{S_p, T_{p,0}, \dots, T_{p,p-1}\}$.



Definition 4.11. Let us refer to a sequence $R_1,\dots,R_n$, $R_i \in H$, as legitimate if any
appearance of the element $S_p$ does not follow the appearance of any element of the form
$T_{p,j}$, and any appearance of the element $T_{p,0}$ does not follow an appearance of $S_p$.

A quick computation shows that the number of legitimate sequences of length $n$ is
$(p+1)p^{n-1}$, which is exactly the cardinality of the sphere $\mathcal{S}_{p^n}(x)$. The following lemma
shows that this is not a coincidence. We leave the proof as an exercise to the reader.

Lemma 4.12. There is an equality

$$\mathcal{S}_{p^n}(x) = \{\Gamma_\infty R_n \cdots R_1 g_x : R_1,\dots,R_n \text{ is legitimate}\}.$$

Furthermore, the infinite sequences $\{R_i\}$ of the elements of $H$ describe all possible paths
on the graph $\mathcal{G}_p(x)$ that start from the root $x$, and a sequence is legitimate if and only if the
path it describes is a branch. Finally, if $\{R_i\}$ is a legitimate sequence which is eventually
periodic, then it corresponds to a rational branch.
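
The count and the equality in Lemma 4.12 can be checked by brute force for small $p$ and $n$ when the root is the class of $\mathbb{Z}^2$. The sketch below is ours and rests on two assumptions: that the generators are $S_p = \begin{pmatrix} p & 0 \\ 0 & 1\end{pmatrix}$ and $T_{p,j} = \begin{pmatrix} 1 & j \\ 0 & p\end{pmatrix}$, and that a class $\Gamma_\infty M$ with $M$ a primitive integer matrix is determined by the row Hermite normal form of $M$. All function names are hypothetical.

```python
from itertools import product

def generators(p):
    """The assumed generators S_p and T_{p,0}, ..., T_{p,p-1}, each with a label."""
    gens = [("S", ((p, 0), (0, 1)))]
    gens += [(("T", j), ((1, j), (0, p))) for j in range(p)]
    return gens

def is_legitimate(labels):
    """S_p never directly follows a T_{p,j}; T_{p,0} never directly follows S_p."""
    for prev, cur in zip(labels, labels[1:]):
        if cur == "S" and prev != "S":
            return False
        if cur == ("T", 0) and prev == "S":
            return False
    return True

def matmul(A, B):
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))

def hermite(M):
    """Row Hermite normal form (a, b, d) of a nonsingular 2x2 integer matrix,
    i.e. a canonical representative of GL_2(Z) * M."""
    (a, b), (c, d) = M
    while c != 0:                      # Euclid on the first column via row operations
        q = a // c
        a, b, c, d = c, d, a - q * c, b - q * d
    if a < 0:
        a, b = -a, -b
    d = abs(d)
    return (a, b % d, d)

def sphere_from_words(p, n):
    """Number of legitimate words of length n and the set of classes they produce."""
    count, classes = 0, set()
    for word in product(generators(p), repeat=n):
        if not is_legitimate([label for label, _ in word]):
            continue
        count += 1
        M = ((1, 0), (0, 1))
        for _, R in word:              # R_1 is applied first, giving R_n ... R_1
            M = matmul(R, M)
        classes.add(hermite(M))
    return count, classes
```

In each case tested, the number of legitimate words and the number of distinct classes both equal $(p+1)p^{n-1}$, so distinct legitimate words reach distinct points of the sphere.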

Let $\alpha$ be a quadratic irrational and $x_\alpha = \Gamma_\infty g_\alpha$ as before. Given a legitimate sequence
$\{R_n\}_{n=0}^{\infty}$, $R_n \in H$, we consider the resulting recursive sequence of numbers

$$\alpha_{-1} = \alpha, \qquad \alpha_n = R_n \cdot \alpha_{n-1}, \quad n \ge 0, \tag{4.13}$$

where here $R_n$ acts on the real line as a Möbius transformation. Clearly $\alpha_n \in \mathbb{Q}(\alpha)$.
We alternate between thinking of the elements $R_n$ of the sequence as describing a walk
along a branch of the graph $\mathcal{G}_p(x_\alpha)$ and describing a sequence of arithmetic operations
which result in the sequence $\alpha_n$ from (4.13). We now explain the connection between
these two viewpoints. Given $h \in \mathrm{PGL}_2(\mathbb{Q})$ the reader should check that $h g_\alpha$ is obtained
from $g_{h\cdot\alpha}$ by multiplication from the right by a diagonal element. In our case, $h$ is one of
the elements of $H$ and the calculation shows that this diagonal element lies in $A_\infty$. The
following lemma follows.

Lemma 4.13. Given a legitimate sequence $\{R_n\}$ corresponding to the branch $\mathcal{L}_{g_\alpha,\omega} \subset \mathcal{G}_p(x_\alpha)$,
let $\alpha_n$ be the resulting sequence of quadratic irrationals given in (4.13). Then

Remark 4.14. Lemmas 4.12, 4.13, when combined with Theorem 4.9, imply the effective
equidistribution of the periodic orbits $x_{\alpha_n}A_\infty$ for more general sequences than
just $p^{\pm n}\alpha$ that were considered so far. For example, given any sequence $\{j_n\}_{n=0}^{\infty}$, where
$j_n \in \{0,\dots,p-1\}$, a quick calculation shows that the sequence of quadratic irrationals
$\alpha_n$ corresponding to the legitimate sequence of operations given by $\{T_{p,j_n}\}$ is given by

$$\alpha_n = \frac{\alpha + \sum_{i=0}^{n} j_i p^i}{p^{n+1}}.$$

If $p$ does not split in $\mathbb{Q}(\alpha)$, then Theorem 4.9 (combined with Lemmas 4.12, 4.13) implies the
equidistribution $\mu_{\alpha_n} \to m_\infty$ (in the effective manner given in (4.10)). If $p$ splits in $\mathbb{Q}(\alpha)$,
a more restrictive result holds; namely, in order to ensure that the corresponding branch
is nondegenerate we confine ourselves to sequences $\alpha_n$ arising from eventually periodic
sequences $j_n$.
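
The "quick calculation" can be replayed with exact arithmetic. The sketch below is ours: it assumes that $T_{p,j}$ acts by the Möbius map $z \mapsto (z+j)/p$ (that is, $T_{p,j} = \begin{pmatrix} 1 & j \\ 0 & p\end{pmatrix}$) and encodes an element $x + y\sqrt{d}$ of the quadratic field as the pair $(x, y)$ of rationals. It compares the recursion (4.13) with the closed form $\alpha_n = (\alpha + \sum_{i=0}^{n} j_i p^i)/p^{n+1}$.

```python
from fractions import Fraction

def apply_T(z, p, j):
    """Mobius action of the assumed T_{p,j} = [[1, j], [0, p]]: z -> (z + j) / p.

    z = (x, y) encodes x + y*sqrt(d); the map only shifts and rescales, so the
    value of d plays no role here.
    """
    x, y = z
    return ((x + j) / Fraction(p), y / Fraction(p))

def walk(alpha, p, digits):
    """The sequence alpha_0, alpha_1, ... obtained by applying T_{p,j_0}, T_{p,j_1}, ..."""
    z, out = alpha, []
    for j in digits:
        z = apply_T(z, p, j)
        out.append(z)
    return out

def closed_form(alpha, p, digits):
    """(alpha + sum_i j_i p^i) / p^(n+1) for digits j_0, ..., j_n."""
    x, y = alpha
    scale = Fraction(p) ** len(digits)
    shift = sum(j * p ** i for i, j in enumerate(digits))
    return ((x + shift) / scale, y / scale)
```

For $\alpha = \sqrt{2}$, $p = 3$ and digits $1, 2, 0, 1$ both computations give $(34 + \sqrt{2})/81$. Taking all digits equal to $0$ recovers the sequence $p^{-n}\alpha$ mentioned in the remark.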
5. Relations to other arguments

Before turning to the proof of Theorem 4.9 we wish to make some comments that will
clarify its relation to arguments giving equidistribution of collections of periodic orbits.
The result of Benoist and Oh [BO07, Theorem 1.1] implies that given a class $x$ with a
periodic $A_\infty$-orbit, the collection of orbits $\{x'A_\infty : x' \in \mathcal{S}_{\mathbf{h}}(x)\}$ (counted without
multiplicities) becomes equidistributed as $\mathbf{h} \to \infty$.

Ignoring the effectivity of Theorem 4.9 and just interpreting it as saying that $\mu_{x'} \to m_\infty$
as $x'$ drifts away from the root $x$ along a nondegenerate generalized branch, it seems
tempting to think that it is considerably stronger than the result of Benoist and Oh, as it
deals with the equidistribution of single orbits as opposed to the equidistribution of the
full collection. We will show in §5.1 below that this 'soft' equidistribution in fact follows
quite elementarily from the work of Benoist and Oh. Nonetheless, as far as we know the
effective statement does not follow easily from known results.

5.1. Total vs. individual growth. Let $x \in X_\infty$ be a class with a periodic $A_\infty$-orbit and
consider the union of the periodic orbits $x'A_\infty$ for $x' \in \mathcal{S}_{\mathbf{h}}(x)$ (where $\mathbf{h}$ is an admissible
radius). We denote the total length of this union by $t_x(\mathbf{h})$; that is, $t_x(\mathbf{h}) = \sum t_{x'}$, where
the sum is taken over a set of representatives of the classes on the sphere giving rise
to different orbits. The following lemma shows that the growth rate of the length of
individual periodic orbits along a nondegenerate generalized branch is the same as the
growth rate of the total length. To some extent this phenomenon is what stands behind
our results. We do not really need this lemma, but we state it here because we think it
explains the phenomenon at hand in a clear way. Nevertheless, in the course of deducing
Corollary 2.5 we will need a part of it and, for completeness, we provide the proof later in the paper.

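Lemma 5.1. Let $x \in X_\infty$ be a class with a periodic $A_\infty$-orbit and $\mathcal{L}$ a nondegenerate
generalized branch of $\mathcal{G}_S(x)$.

(1) The total length $t_x(\mathbf{h})$ satisfies $\mathbf{h} \ll_{x,S_f} t_x(\mathbf{h}) \ll_{x,S_f} \mathbf{h}$.

(2) Let $c(\mathbf{h}) = c_{\mathcal{L}}(\mathbf{h})$ be the function defined by the equation $t_{x_{\mathbf{h}}} = c(\mathbf{h})\,\mathbf{h}$, where $x_{\mathbf{h}}$ is
the class in $\mathcal{S}_{\mathbf{h}}(x) \cap \mathcal{L}$. Then $c(\mathbf{h})$ attains only finitely many values and moreover,
if $\mathbf{h}_n$ is a divisibility sequence of admissible radii (that is, $\mathbf{h}_n \mid \mathbf{h}_{n+1}$), then $c(\mathbf{h}_n)$
stabilizes.

(3) The class $x$ is non-split for $S_f$ if and only if

$$\inf\{c_{\mathcal{L}}(\mathbf{h}) : \mathcal{L} \text{ nondegenerate},\ \mathbf{h} \text{ an admissible radius}\} > 0. \tag{5.1}$$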
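
Part (2) can be observed numerically in a special case. The sketch below is ours and makes several assumptions that the text does not spell out: we take $\alpha = \sqrt{2}$ and $\mathbf{h} = p^n$, we use that the lattice $\operatorname{span}_{\mathbb{Z}}\{1, \mathbf{h}\sqrt{2}\}$ is stabilized exactly by the units of the order $\mathbb{Z}[\mathbf{h}\sqrt{2}]$, and we assume (in the spirit of the proof of Lemma 4.4) that the period of $x_{\mathbf{h}\alpha}$ is proportional to the smallest $k$ with $(1+\sqrt{2})^k \in \mathbb{Z}[\mathbf{h}\sqrt{2}]$, with a constant depending only on the normalization of $a_\infty(t)$. The function name `unit_index` is hypothetical.

```python
def unit_index(h, max_steps=10**6):
    """Smallest k >= 1 such that (1 + sqrt(2))**k lies in Z[h*sqrt(2)].

    Writing (1 + sqrt(2))**k = a_k + b_k*sqrt(2), this is the first k with h | b_k.
    Only the residues of a_k, b_k modulo h are tracked.
    """
    a, b = 1 % h, 1 % h
    k = 1
    while b != 0:
        # multiply by 1 + sqrt(2): (a + b*sqrt(2)) -> (a + 2b) + (a + b)*sqrt(2)
        a, b = (a + 2 * b) % h, (a + b) % h
        k += 1
        if k > max_steps:
            raise RuntimeError("no power found within max_steps")
    return k
```

Along $\mathbf{h}_n = 3^n$ (where $3$ is inert in $\mathbb{Q}(\sqrt{2})$) the ratio `unit_index(h) / h` equals $4/3$ for $n = 1, 2, 3$, and along $\mathbf{h}_n = 7^n$ (where $7$ splits) it equals $6/7$ for $n = 1, 2$, consistent with the stabilization of $c(\mathbf{h}_n)$ along a rational branch.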

The first two parts of Lemma 5.1 show that if $\mathcal{L}$ is a nondegenerate generalized branch,
then a single orbit $x_{\mathbf{h}}A_\infty$ through the class $x_{\mathbf{h}} \in \mathcal{L} \cap \mathcal{S}_{\mathbf{h}}(x)$ actually occupies a
positive proportion (bounded below by a constant independent of $\mathbf{h}$) of the full collection
$\{x'A_\infty : x' \in \mathcal{S}_{\mathbf{h}}(x)\}$. Relying on [BO07] we may argue the soft version of Theorem 4.9
(that is, that $\mu_{x_{\mathbf{h}}} \to m_\infty$ as $\mathbf{h} \to \infty$) in the following way: Let $\mathbf{h}_i \to \infty$ be a sequence
of admissible radii such that $\mu_{x_{\mathbf{h}_i}}$ converges to, say, $\mu_\infty$ (which is an $A_\infty$-invariant
measure). We need to argue that $\mu_\infty = m_\infty$. Let $\eta_{\mathbf{h}}$ be the natural $A_\infty$-invariant probability
measure supported on the collection of periodic orbits $\{x'A_\infty : x' \in \mathcal{S}_{\mathbf{h}}(x)\}$. By [BO07],
$\eta_{\mathbf{h}_i} \to m_\infty$. By the first two parts of Lemma 5.1 we can write $\eta_{\mathbf{h}_i}$ as a convex combination
of $A_\infty$-invariant probability measures in the following way: $\eta_{\mathbf{h}_i} = c'_{\mathbf{h}_i}\mu_{x_{\mathbf{h}_i}} + (1 - c'_{\mathbf{h}_i})\nu_{\mathbf{h}_i}$,
where the constants $c'_{\mathbf{h}_i}$ are bounded below by some positive constant $c'$ independent of $\mathbf{h}_i$.
Taking $i$ to $\infty$ (along an appropriate subsequence if necessary) we deduce that in the limit
$m_\infty = c'_\infty\mu_\infty + (1 - c'_\infty)\nu_\infty$ for some positive constant $c'_\infty \le 1$. By the ergodicity of $m_\infty$
with respect to the $A_\infty$-action we deduce that the limit $\mu_\infty$, which appears in the above
convex combination with positive weight, must be equal to $m_\infty$. This establishes the
desired convergence. We note here, as mentioned above, that even if one starts with an
effective statement regarding the equidistribution of the measures $\eta_{\mathbf{h}}$ supported on the
collection of orbits, it is not clear how to use the above argument to deduce an effective
statement such as in Theorem 4.9 regarding the equidistribution of a positive fraction of
the collection.

Remark 5.2. As noted in the introduction, in the above argument, instead of appealing
to [BO07], one can appeal to Duke's Theorem [Duk88], [ELMV].

6. Proof of Theorem 4.9

Throughout this section we fix $x \in X_\infty$ to be a class with a periodic $A_\infty$-orbit and
a representative $g_x \in G_\infty$ such that $x = \Gamma_\infty g_x$. Using the notation of Definition 4.6, it
follows that there exists $\gamma_x \in \Gamma_\infty$ such that

$$\gamma_x g_x a_\infty(t_x) = g_x. \tag{6.1}$$

We briefly discuss the relations between the various parts of Theorem 4.9. As the
eigenvalues of $\gamma_x$ are irrational (and not roots of unity) it follows that $\gamma_x$ (or any of its powers)
is not triangulizable over $\mathbb{Q}$ and so all the rational generalized branches are nondegenerate.
Therefore part (4) of the theorem follows from part (1). Part (5) of the theorem follows
from part (3) because of (4.3), which shows that any class on the $S$-Hecke graph $\mathcal{G}_S(x)$ lies
on a rational generalized branch; the sequence $x_{\mathbf{h}_n}$ produced by part (3) may be viewed
as a sequence of classes lying on (varying) rational generalized branches, showing that a
uniform implicit constant for all rational generalized branches in (4.10) is impossible.

We begin with the necessary preparations for the arguments yielding parts (1), (2),
and (3). We will see below that part (3) is a simple observation once the stage is set
correctly, and so the main bulk of the theorem lies in establishing parts (1) and (2).

After fixing $g_x$ we fix a generalized branch in $\mathcal{G}_S(x)$; that is, we fix an element $\omega \in K_{S_f}$
and set $\mathcal{L}_\omega = \mathcal{L}_{g_x,\omega}$. Although $\omega$ is fixed, the reader should bear in mind that at some
point we will vary the choice of $\omega$ in order to change the generalized branch.

6.1. The lift of a closed loop. The following construction is fundamental to our
argument. Let $y_\omega \in X_S$ be defined by $y_\omega = \Gamma_S(g_x,\omega)$. Consider the orbit $y_\omega A_\infty \subset X_S$ and
note that

$$\pi(y_\omega) = x, \qquad xA_\infty = \pi(y_\omega A_\infty) = \pi(\overline{y_\omega A_\infty}), \tag{6.2}$$

where the rightmost equality follows from the fact that $xA_\infty$ is compact and the continuity
of the projection $\pi$.

We now analyze the closure $\overline{y_\omega A_\infty}$. Each $t \in \mathbb{R}$ can be written in a unique way in the
form $t = s + \ell t_x$ for some $s \in [0,t_x)$ and $\ell \in \mathbb{Z}$. It follows from (6.1) that

$$y_\omega a_\infty(t) = \Gamma_S(g_x a_\infty(\ell t_x)a_\infty(s), \omega) = \Gamma_S(g_x a_\infty(s), \gamma_x^{\ell}\omega) = y_\omega(a_\infty(s), \omega^{-1}\gamma_x^{\ell}\omega). \tag{6.3}$$

If we denote, for an element $\gamma$ in a group $H$, by $\langle\gamma\rangle_H$ the cyclic group generated by $\gamma$ in
$H$, then it follows from (6.3) that

$$y_\omega A_\infty = y_\omega\left(A_\infty \times \langle\omega^{-1}\gamma_x\omega\rangle_{G_{S_f}}\right). \tag{6.4}$$

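Let

$$H_\omega = \omega^{-1}\,\overline{\langle\gamma_x\rangle}_{G_{S_f}}\,\omega = \overline{\langle\omega^{-1}\gamma_x\omega\rangle}_{G_{S_f}}. \tag{6.5}$$

Clearly, $H_\omega$ is a compact subgroup of $K_{S_f}$. We let

$$L_\omega = A_\infty \times H_\omega. \tag{6.6}$$

Lemma 6.1. The orbit $y_\omega L_\omega$ is compact and

$$\overline{y_\omega A_\infty} = y_\omega L_\omega. \tag{6.7}$$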

Proof. We first establish (6.7). The inclusion $\supset$ follows readily from (6.4). For the reverse
inclusion, let $t_n \in \mathbb{R}$ be such that $y_\omega a_\infty(t_n) \to y \in \overline{y_\omega A_\infty}$ as $n \to \infty$. Let $s_n \in [0,t_x)$, $\ell_n \in \mathbb{Z}$ be
as defined before (6.3); that is, $t_n = s_n + \ell_n t_x$. By compactness we may assume without loss
of generality (after passing to a subsequence if necessary) that $s_n \to s$ and $\omega^{-1}\gamma_x^{\ell_n}\omega \to h$.
We conclude from (6.3) that

$$y = \lim y_\omega a_\infty(t_n) = \lim y_\omega(a_\infty(s_n), \omega^{-1}\gamma_x^{\ell_n}\omega) = y_\omega(a_\infty(s), h) \in y_\omega L_\omega. \tag{6.8}$$

The fact that the orbit $y_\omega L_\omega$ is compact now follows from the fact that it is a closed set
contained in $\pi^{-1}(xA_\infty)$, which is compact by the properness of $\pi$. □

Remark 6.2. The above proof actually establishes a bit more: We have shown that in
fact,

$$\overline{y_\omega A_\infty} = y_\omega L_\omega = \{y_\omega(a_\infty(t), h) : t \in [0,t_x),\ h \in H_\omega\}. \tag{6.9}$$

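Definition 6.3. Let $\eta_\omega$ denote the $L_\omega$-invariant probability measure supported on the
compact (and hence periodic) orbit $y_\omega L_\omega$. For an admissible radius $\mathbf{h}$ let

$$y_{\omega,\mathbf{h}} = y_\omega a_f(\mathbf{h}), \qquad L_{\omega,\mathbf{h}} = L_\omega^{a_f(\mathbf{h})}, \qquad H_{\omega,\mathbf{h}} = H_\omega^{a_f(\mathbf{h})},$$

and note the identity $y_\omega L_\omega a_f(\mathbf{h}) = y_{\omega,\mathbf{h}}L_{\omega,\mathbf{h}} = y_{\omega,\mathbf{h}}(A_\infty \times H_{\omega,\mathbf{h}})$. We denote the unique
$L_{\omega,\mathbf{h}}$-invariant probability measure supported on the periodic orbit $y_{\omega,\mathbf{h}}L_{\omega,\mathbf{h}}$ by $\eta_{\omega,\mathbf{h}}$. By
Lemma 3.2 we have that $(a_f(\mathbf{h}))_*\eta_\omega = \eta_{\omega,\mathbf{h}}$. Note that the notation $y_{\omega,\mathbf{h}}$ is consistent with
the one introduced in Remark 4.3(1).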

Lemma 6.4. Let $\mathbf{h}$ be an admissible radius and $x_{\omega,\mathbf{h}} \in \mathcal{L}_\omega \cap \mathcal{S}_{\mathbf{h}}(x)$. Then the pushed orbit
$y_\omega L_\omega a_f(\mathbf{h}) = y_{\omega,\mathbf{h}}L_{\omega,\mathbf{h}}$ projects to the periodic orbit $x_{\omega,\mathbf{h}}A_\infty$ and furthermore, the measure
$\eta_{\omega,\mathbf{h}}$ supported on it is a lift of $\mu_{x_{\omega,\mathbf{h}}}$; that is, $\pi_*\eta_{\omega,\mathbf{h}} = \mu_{x_{\omega,\mathbf{h}}}$.

The above lemma puts us in a desirable situation from the dynamical point of view;
instead of studying the orbits $x'A_\infty$ in the space $X_\infty$ as $x'$ drifts away from $x$ on a
generalized branch (the connection between which is not clear a priori), we will study the
images of the fixed orbit $y_\omega L_\omega$ under the action of $a_f(\mathbf{h})$ for admissible radii $\mathbf{h}$, which
share a clear algebraic (and geometric) relation. This relation is the reason we needed to
introduce the $S$-arithmetic extension $X_S$.

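Proof. The fact that $x_{\omega,\mathbf{h}} = \pi(y_{\omega,\mathbf{h}})$ follows from Definition 6.3 and Remark 4.3(1). We
have that

$$\pi(y_{\omega,\mathbf{h}}L_{\omega,\mathbf{h}}) = \pi(y_\omega L_\omega a_f(\mathbf{h})) = \pi(\overline{y_\omega A_\infty}\,a_f(\mathbf{h})) = \pi(\overline{y_\omega a_f(\mathbf{h})A_\infty}) = \overline{\pi(y_{\omega,\mathbf{h}})A_\infty} = x_{\omega,\mathbf{h}}A_\infty, \tag{6.10}$$

where the first equality from the left follows from Definition 6.3; the second, from Lemma 6.1
and the fact that $a_f(\mathbf{h})$ acts on $X_S$ by a homeomorphism; the third, from the
commutation of $A_\infty$ and $a_f(\mathbf{h})$ and from the continuity of $\pi$; the fourth, from the fact that $\pi$
intertwines the $A_\infty$-actions on $X_S$, $X_\infty$; and finally the fifth equality follows from the fact
that the orbit $x_{\omega,\mathbf{h}}A_\infty$ is compact.

As $A_\infty < L_{\omega,\mathbf{h}}$, $\eta_{\omega,\mathbf{h}}$ is $A_\infty$-invariant. As a consequence, the projection $\pi_*\eta_{\omega,\mathbf{h}}$ is an
$A_\infty$-invariant probability measure supported on $x_{\omega,\mathbf{h}}A_\infty$. As $\mu_{x_{\omega,\mathbf{h}}}$ is the unique such measure,
we conclude that $\pi_*\eta_{\omega,\mathbf{h}} = \mu_{x_{\omega,\mathbf{h}}}$ as desired. □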

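Remark 6.5. It follows from (6.9) and the definition of $y_{\omega,\mathbf{h}}$, $H_{\omega,\mathbf{h}}$ that

$$y_{\omega,\mathbf{h}}L_{\omega,\mathbf{h}} = \{y_{\omega,\mathbf{h}}(a_\infty(t), h) : t \in [0,t_x),\ h \in H_{\omega,\mathbf{h}}\}.$$

By (6.10) the following equality follows:

$$x_{\omega,\mathbf{h}}A_\infty = \pi\left(\{y_{\omega,\mathbf{h}}(a_\infty(t), h) : t \in [0,t_x),\ h \in H_{\omega,\mathbf{h}}\}\right). \tag{6.11}$$

The meaning of the above equation is that the only reason for the orbit $x_{\omega,\mathbf{h}}A_\infty$ to become
long is that the group $H_{\omega,\mathbf{h}}$ stretches and 'sticks out' of $K_{S_f}$. This is illustrated in the
following proof.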

Proof of part (3) of Theorem 4.9. For an admissible radius $\mathbf{h}$ and $p \in S_f$ denote by
$(H_{\omega,\mathbf{h}})_p$ the projection of the group $H_{\omega,\mathbf{h}}$ on its $p$-th component. Note that by
definition, $(H_{\omega,\mathbf{h}})_p = \operatorname{diag}(1,\mathbf{h}^{-1})\,(H_\omega)_p\,\operatorname{diag}(1,\mathbf{h})$.

Assume that the generalized branch is degenerate. It follows that there exists $p \in S_f$
for which some power of the $p$-th component $(\omega^{-1}\gamma_x\omega)_p$ is upper triangular. Let $d$ be the
minimal positive integer for which $(\omega^{-1}\gamma_x^{d}\omega)_p$ is upper triangular. We conclude from (6.5)
that $(H_\omega)_p$ contains an index $d$ subgroup that consists of upper triangular elements only.
Choose $\mathbf{h}_n = p^n$ and note that because of the above $(H_{\omega,\mathbf{h}_n})_p \cap K_p$ is of index at most $d$
in $(H_{\omega,\mathbf{h}_n})_p$. Moreover, note that as $p$ is a unit in $\mathbb{Z}_{p'}$ for any prime $p' \ne p$, we have that
$(H_{\omega,\mathbf{h}_n})_{p'} < K_{p'}$. It follows that along the chosen sequence $\mathbf{h}_n$ we have that $H_{\omega,\mathbf{h}_n} \cap K_{S_f}$
has at most index $d$ in $H_{\omega,\mathbf{h}_n}$. Let $h_i \in H_{\omega,\mathbf{h}_n}$, $i = 1,\dots,d'$, $d' \le d$, be representatives of
the cosets of $H_{\omega,\mathbf{h}_n} \cap K_{S_f}$ and denote $y_i = y_{\omega,\mathbf{h}_n}h_i$, $i = 1,\dots,d'$, and $x_i = \pi(y_i)$. We can
rewrite (6.11) as

$$x_{\omega,\mathbf{h}_n}A_\infty = \pi\Big(\bigcup_{i=1}^{d'}\{y_{\omega,\mathbf{h}_n}(a_\infty(t), h_i h) : t \in [0,t_x),\ h \in H_{\omega,\mathbf{h}_n} \cap K_{S_f}\}\Big)$$
$$= \pi\Big(\bigcup_{i=1}^{d'}\{y_i(a_\infty(t), h) : t \in [0,t_x),\ h \in H_{\omega,\mathbf{h}_n} \cap K_{S_f}\}\Big)$$
$$= \bigcup_{i=1}^{d'}\{x_i a_\infty(t) : t \in [0,t_x)\}, \tag{6.12}$$

and so we conclude that $t_{x_{\omega,\mathbf{h}_n}} \le d't_x$, which finishes the proof. □

In order to finish the proof of Theorem 4.9 we are left to argue parts (1), (2). As said
before, these are the main parts of the theorem.

6.2. Strategy of the proof of Theorem 4.9 (1), (2). In the notation of Lemma 6.4,
because $\pi_*\eta_{\omega,\mathbf{h}} = \mu_{x_{\omega,\mathbf{h}}}$, the validity of (4.10) is equivalent to saying that given $\psi \in L^2(X_S,m_S)$
which is $K_{S_f}$-invariant (i.e. is of the form $\psi_0 \circ \pi$ for $\psi_0 \in L^2(X_\infty,m_\infty)$) and
$\kappa$-Lipschitz, then

$$\left|\int \psi\, d\eta_{\omega,\mathbf{h}} - \int \psi\, dm_S\right| \ll_{x,S_f,\mathcal{L}_\omega,\epsilon} \max\{\kappa, \|\psi\|_2\}\, \mathbf{h}^{-\frac{\delta_0}{2}+\epsilon}. \tag{6.13}$$

The argument giving this 'effective equidistribution' is a combination of an argument
which we will refer to as the mixing trick and spectral gap (or effective decay of matrix
coefficients). As far as we know the mixing trick originates from Margulis' thesis [Mar04].
We briefly describe its heuristics: One slightly thickens the initial orbit $y_\omega L_\omega$ to an open
set $T \subset X_S$ in directions which are (weakly) contracted by the action of $a_f(\mathbf{h})$. The
set $T$ will be called below a tube around the orbit $y_\omega L_\omega$. Let $m_T$ denote the normalized
restriction of $m_S$ to $T$. The pushed measure $(a_f(\mathbf{h}))_*m_T$ is the normalized restriction of
$m_S$ to the pushed tube $Ta_f(\mathbf{h})$, which is a tube around the orbit $y_{\omega,\mathbf{h}}L_{\omega,\mathbf{h}}$. Because the
thickening used to construct $T$ is taken in directions which are (weakly) contracted by
$a_f(\mathbf{h})$, the size of the thickening giving the tube $Ta_f(\mathbf{h})$ is even smaller than the size of
the initial thickening. Hence, there should not be much of a difference between integrating
against the measure $\eta_{\omega,\mathbf{h}}$ and integrating against $(a_f(\mathbf{h}))_*m_T$. The fact that the action of
$a_f(\mathbf{h})$ is mixing on $(X_S,m_S)$ means that the pushed measure $(a_f(\mathbf{h}))_*m_T$ is 'close' to $m_S$
(here, the effective mixing Theorem 6.6 will allow us to pin down the meaning of 'close'
in a precise way). Combining these things together will give us the desired estimate given
in (6.13).

In order to make this strategy into a rigorous proof we discuss in the next two subsec- 
tions in detail the construction of tubes and decay of matrix coefficients. 

6.3. Effective mixing. Let $\mathcal{H} = L^2(X_S,m_S)$. Our goal in this section is to prove the
following:

Theorem 6.6. Let $\mathbf{h}$ be an admissible radius and $w_1,w_2 \in \mathcal{H}$ be vectors with the following
properties: $w_1$ is $K_{S_f}$-fixed and $w_2$ is stabilized by a product subgroup $K^* = \prod_{v\in S_f} K_v^* < K_{S_f}$
of index $d$ in $K_{S_f}$. Then for any $\epsilon > 0$,

$$\left|\langle w_1, a_f(\mathbf{h})w_2\rangle - \langle w_1,1\rangle\langle 1,w_2\rangle\right| \ll_\epsilon \|w_1\|\,\|w_2\|\, d^{1/2}\,\mathbf{h}^{-\delta_0+\epsilon}. \tag{6.14}$$

The meaning of the exponent $\delta_0$ that appears in (6.14) will be explicated shortly. Before
turning to the proof of the above theorem, we need to discuss three lemmas. For $v \in S$
let $\mathcal{H}_v$ denote the orthocomplement of the $G_v$-invariant functions in $\mathcal{H}$. The following
is [Ven10, Lemma 9.1]. It is the key input in the proof of Theorem 6.6.

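Lemma 6.7. Let $w_1,w_2 \in \mathcal{H}_v$ ($v \in S_f$) be two vectors which are stabilized respectively by
finite index subgroups $K_v^{(i)}$ of $K_v$, let $d_i = [K_v : K_v^{(i)}]$, and $a_v(t) = \operatorname{diag}(1,t)$, $t \in \mathbb{Q}_v^*$.

Then the following holds:

$$|\langle w_1, a_v(t)w_2\rangle| \ll_\epsilon \|w_1\|\,\|w_2\|\, d_1^{1/2}d_2^{1/2}\max\{|t|_v, |t|_v^{-1}\}^{-\delta_0+\epsilon}. \tag{6.15}$$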

The exponent $\delta_0$ comes from the following discussion. Let $\rho_v$ be the unitary
representation of $G_v$ on $\mathcal{H}_v$. Let $\sigma_0$ be the smallest number so that no complementary series
representation of parameter $> \sigma_0$ is weakly contained in $\rho_v$. Here we follow [Ven10] and
parametrize the complementary series representations by the parameter $\sigma \in (0,\tfrac{1}{2})$; so
$\sigma_0 = 0$ corresponds to $\rho_v$ being tempered (the Ramanujan conjecture) and $\sigma_0 < \tfrac{1}{2}$
corresponds to $\rho_v$ having no almost invariant vectors. The best bound known today towards
Ramanujan is given by Kim and Sarnak in the appendix of [Kim03] and establishes the
bound $\sigma_0 \le \tfrac{7}{64}$. The exponent $\delta_0$ that appears in Lemma 6.7 and that appears in our
results is defined by

$$\delta_0 = \tfrac{1}{2} - \sigma_0, \tag{6.16}$$

so the Kim-Sarnak bound reads as $\delta_0 \ge \tfrac{25}{64}$.

Lemma 6.7 is stated for one place $v \in S_f$, but in Theorem 6.6 we wish to take advantage
of the various places $\mathbf{h}$ is supported on. In order to do this, we will need to use Lemma 6.7
iteratively, and the following abstract lemma in Hilbert space theory allows us to do so.

Lemma 6.8. Let $G = G_1 \times G_2$ be a group acting unitarily on a Hilbert space $\mathcal{H}$. Let
$K_i < G_i$ be subgroups, $g_i \in G_i$ be two given elements, and $F(g_i)$ two positive numbers
satisfying the following statement: For each $i$, if $v,w \in \mathcal{H}$ are $K_i$-fixed vectors, then

$$\langle g_i v, w\rangle \le \|v\|\,\|w\|\,F(g_i).$$

Then for any $v,w \in \mathcal{H}$ which are $K_1 \times K_2$-fixed we have that

$$\langle g_1 g_2 v, w\rangle \le \|v\|\,\|w\|\,F(g_1)F(g_2).$$

Proof. Let us denote for $i = 1,2$, $V_i = \{v \in \mathcal{H} : v \text{ is } K_i\text{-fixed}\}$ and $U = V_1 \cap V_2$. Let $V_i'$
denote the orthocomplement of $U$ in $V_i$ and denote, for a subspace $W$ of $\mathcal{H}$, by $P_W$ the
orthogonal projection on $W$. We first note that $V_1,V_2$ are $K_2,K_1$-invariant respectively
(because $K_1,K_2$ commute) and so the projections $P_{V_1},P_{V_2}$ commute with the actions of
$K_2,K_1$ respectively. It follows from here that given $v_1 \in V_1$, say, the projection $P_{V_2}(v_1)$ is
fixed by both $K_1$ and $K_2$, i.e. $P_{V_2}(v_1) \in U$. This proves that $V_1'$ is orthogonal to $V_2$ or, in
a more symmetric manner, $V_1'$ is orthogonal to $V_2'$.

Let now $v,w$ be two $K_1 \times K_2$-fixed vectors. As $g_1v$ is $K_2$-fixed, i.e. $g_1v \in V_2$, we may
write $g_1v = P_U(g_1v) + P_{V_2'}(g_1v)$ and similarly $g_2w = P_U(g_2w) + P_{V_1'}(g_2w)$. It follows that

$$\langle g_1v, g_2w\rangle = \langle P_U(g_1v) + P_{V_2'}(g_1v),\ P_U(g_2w) + P_{V_1'}(g_2w)\rangle = \langle P_U(g_1v), P_U(g_2w)\rangle \le \|P_U(g_1v)\|\,\|P_U(g_2w)\|. \tag{6.17}$$

Let $\tilde{v} = \frac{P_U(g_1v)}{\|P_U(g_1v)\|}$. Then $\tilde{v}$ is $K_1$-fixed and so by the assumption of the lemma we conclude
that

$$\|P_U(g_1v)\| = \langle g_1v, \tilde{v}\rangle \le \|v\|\,F(g_1).$$

Similarly, $\|P_U(g_2w)\| \le \|w\|\,F(g_2)$. Plugging this into (6.17) yields

$$\langle g_1v, g_2w\rangle \le \|v\|\,\|w\|\,F(g_1)F(g_2),$$

which is equivalent to the desired statement up to replacing $g_2$ by its inverse (note that
the assumption on $g_2$ implies the corresponding assumption on $g_2^{-1}$). □

The final ingredient needed for the proof of Theorem 6.6 is the following.

Lemma 6.9. For each place $v \in S$ the group generated by $G_v$ and $K_{S_f}$ acts ergodically on
$X_S$, that is, $\{w \in \mathcal{H} : w \text{ is both } G_v\text{- and } K_{S_f}\text{-fixed}\}$ is the one dimensional space of constant
functions.

Proof. Let $Y_S = \mathrm{SL}_2(\mathbb{Z}[p^{-1} : p \in S_f]) \backslash \prod_{v \in S} \mathrm{SL}_2(\mathbb{Q}_v)$. The strong approximation property
for $\mathrm{SL}_2$ implies that for any $v \in S$ the lattice $\mathrm{SL}_2(\mathbb{Z}[p^{-1} : p \in S_f])$ embeds densely in
$\prod_{\nu \in S \setminus \{v\}} \mathrm{SL}_2(\mathbb{Q}_\nu)$. This is equivalent to saying that $\mathrm{SL}_2(\mathbb{Q}_v)$ acts minimally on $Y_S$ (i.e.
that any orbit is dense). In turn, this implies that $\mathrm{SL}_2(\mathbb{Q}_v)$ acts ergodically on $Y_S$ (by the
duality trick for example). Now, consider the natural map $\psi : \mathrm{SL}_2 \to \mathrm{PGL}_2$. This map
induces a map from $Y_S$ to $X_S$ (which we also denote by $\psi$) which intertwines the actions
of $\mathrm{SL}_2(\mathbb{Q}_v)$ and $\psi(\mathrm{SL}_2(\mathbb{Q}_v)) < G_v$ on these spaces respectively. It follows that the action
of $\psi(\mathrm{SL}_2(\mathbb{Q}_v))$ on $\psi(Y_S)$ is ergodic.

Let $w \in \mathcal{H}$ be a function on $X_S$ which is both $G_v$- and $K_{S_f}$-invariant. Its restriction to
$\psi(Y_S)$ is constant by the ergodicity proved above. It follows that in order to show that $w$
is constant it is enough to show that the translates of $\psi(Y_S)$ by $K_{S_f}$ cover $X_S$. We briefly
sketch the argument: There is a natural 'determinant map' $\det : G_S \to \prod_{\nu \in S} \mathbb{Q}_\nu^*/(\mathbb{Q}_\nu^*)^2$.
Let us denote $A = \prod_{\nu \in S} \mathbb{Q}_\nu^*/(\mathbb{Q}_\nu^*)^2$ and $A' = \det(\Gamma_S) < A$. It follows that there is a well
defined map $\det : \Gamma_S \backslash G_S = X_S \to A' \backslash A$. We leave it to the reader to show that the
space $\psi(Y_S)$ is characterized as the preimage of the identity coset $A'$ under $\det$. Since $\det$
takes $K_{S_f}$ onto $A' \backslash A$, we conclude that indeed, translates of $\psi(Y_S)$ under $K_{S_f}$ cover $X_S$
as desired. □
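
As a concrete illustration of the target of the determinant map, the following sketch computes the class of a nonzero integer in $\mathbb{Q}_p^*/(\mathbb{Q}_p^*)^2$ for an odd prime $p$. It relies on the standard fact that, for odd $p$, this class is determined by the parity of the $p$-adic valuation together with the Legendre symbol of the unit part, so that the quotient has four elements. The function name `square_class`, and the restriction to integer inputs and odd primes, are assumptions made for the illustration and are not notation from the text.

```python
def square_class(n, p):
    """Class of the nonzero integer n in Q_p^* / (Q_p^*)^2 for an odd prime p.

    Returns the pair (valuation parity, Legendre symbol of the unit part).
    Two nonzero integers lie in the same square class iff the pairs agree.
    Hypothetical helper: p is assumed (not checked) to be an odd prime.
    """
    if n == 0 or p < 3:
        raise ValueError("n must be nonzero and p an odd prime")
    valuation = 0
    while n % p == 0:            # strip the powers of p from n
        n //= p
        valuation += 1
    # Euler's criterion: u^((p-1)/2) = 1 mod p iff u is a square mod p
    legendre = 1 if pow(n % p, (p - 1) // 2, p) == 1 else -1
    return (valuation % 2, legendre)


if __name__ == "__main__":
    # The four square classes of Q_5^* are all hit by small integers.
    print(sorted({square_class(n, 5) for n in range(1, 51)}))
```

For $p = 2$ the quotient has eight elements and a different invariant is needed, which this sketch does not cover.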

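Proof of Theorem 6.6. Let $\mathcal{H} = L^2(X_S, m_S)$ and for $v \in S$ let $\mathcal{H}_v$ be the orthocomplement
to the $G_v$-fixed vectors. Let $\mathcal{H}_0 = \cap_{v \in S_f} \mathcal{H}_v$ and let $w_1, w_2 \in \mathcal{H}$ be as in the statement of
the theorem. Write

$$w_i = P_{\mathcal{H}_0}(w_i) + P_{\mathcal{H}_0^\perp}(w_i),$$

and note that the decomposition $\mathcal{H} = \mathcal{H}_0 \oplus \mathcal{H}_0^\perp$ is $G_S$-invariant. It follows that

$$\langle a_f(h) w_1, w_2\rangle = \big\langle a_f(h)\big(P_{\mathcal{H}_0}(w_1) + P_{\mathcal{H}_0^\perp}(w_1)\big),\, P_{\mathcal{H}_0}(w_2) + P_{\mathcal{H}_0^\perp}(w_2)\big\rangle
= \underbrace{\langle a_f(h) P_{\mathcal{H}_0}(w_1), P_{\mathcal{H}_0}(w_2)\rangle}_{(*)} + \underbrace{\langle a_f(h) P_{\mathcal{H}_0^\perp}(w_1), P_{\mathcal{H}_0^\perp}(w_2)\rangle}_{(**)}. \tag{6.18}$$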

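Let us first argue that $(**) = \langle w_1, 1\rangle\langle 1, w_2\rangle$. The space $\mathcal{H}_0^\perp$ is the space generated by
$\{\mathcal{H}_v^\perp\}_{v \in S_f}$. This implies that the vector $P_{\mathcal{H}_0^\perp}(w_1)$ is in the span of the vectors $P_{\mathcal{H}_v^\perp}(w_1)$
as $v$ runs through $S_f$. For each $v \in S_f$ the vector $P_{\mathcal{H}_v^\perp}(w_1)$ is both $G_v$- and $K_{S_f}$-fixed
and so by Lemma 6.9 this implies that $P_{\mathcal{H}_v^\perp}(w_1) \in \mathcal{H}_c$, where $\mathcal{H}_c$ denotes here the 1-dimensional
space of constant functions. We conclude that $P_{\mathcal{H}_0^\perp}(w_1) \in \mathcal{H}_c$, or in other
words, $P_{\mathcal{H}_0^\perp}(w_1) = P_{\mathcal{H}_c}(w_1) = \langle w_1, 1\rangle$. Using this we see that

$$(**) = \langle \langle w_1, 1\rangle, P_{\mathcal{H}_0^\perp}(w_2)\rangle = \langle w_1, 1\rangle\langle 1, P_{\mathcal{H}_0^\perp}(w_2)\rangle. \tag{6.19}$$

In turn, $\langle P_{\mathcal{H}_0^\perp}(w_2), 1\rangle$ is the orthogonal projection of $P_{\mathcal{H}_0^\perp}(w_2)$ on $\mathcal{H}_c$, but as $\mathcal{H}_c \subset \mathcal{H}_0^\perp$, this
projection equals $\langle w_2, 1\rangle$. We conclude from (6.19) that $(**) = \langle w_1, 1\rangle\langle 1, w_2\rangle$ as claimed.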

We now analyze $(*)$ in (6.18). Because the decomposition $\mathcal{H} = \mathcal{H}_0 \oplus \mathcal{H}_0^\perp$ is $G_S$-invariant
the vectors $P_{\mathcal{H}_0}(w_1), P_{\mathcal{H}_0}(w_2)$ are fixed under $K_{S_f}, K^*$ respectively (where $K^*$ is as in the
statement of the theorem). Order the primes in $S_f$ in some way $p_1, \dots, p_k$ and denote
$d_{p_i} = [K_{p_i} : K^*_{p_i}]$, so $d = [K_{S_f} : K^*] = \prod_{i=1}^{k} d_{p_i}$. We leave it to the reader to prove by a
simple induction, using Lemmas 6.7, 6.8, that for $j = 1, \dots, k$

3 3 k 

(Y[a Pi (h)P no (wi),Pn (w 2 )) « e ||P Wo K)|| ||P % K)|| f[4 J[ |hf/ 0+£ . (6.20) 

i=l i=l i=l 

In particular, for $j = k$ we obtain

$$(*) = \langle a_f(h) P_{\mathcal{H}_0}(w_1), P_{\mathcal{H}_0}(w_2)\rangle \ll_\epsilon \|w_1\|\,\|w_2\|\, d\, h^{-\delta_0 + \epsilon}. \tag{6.21}$$

Equations (6.21), (6.18) and the analysis carried out above for $(**)$ now imply the validity of
the theorem. □

6.4. Tubes. As explained in §6.2, we start with a $K_{S_f}$-invariant 'test function' $\varphi$ and
we need to thicken the orbit $y_\omega L_\omega$ to a tube $\mathcal{T}$ and then apply (6.14) to the vectors
$w_1 = \varphi$, $w_2 = \chi_{\mathcal{T}}$. In order for the use of (6.14) to be meaningful we need to control
$d$, which is the index of the stabilizer of the tube in $K_{S_f}$. Also, the 'width' of the tube
(i.e. the size of the thickening of the orbit) should be very small (at least in the real
component) in order for the heuristics of §6.2 to take effect. This will hopefully motivate
the constructions in this subsection.

Definition 6.10. Let $yL \subset X_S$ be a compact orbit of a closed subgroup $L < G_S$. Let
$V = \oplus_{v \in S} V_v$ be a linear complement to $\mathrm{Lie}(L)$ in $\mathfrak{g}_S$. Let $U \subset V$ be a small enough
open neighborhood of $0$ so that the map $yL \times U \to X_S$ defined by $(z, u) \mapsto z \exp_S(u)$
is a homeomorphism onto its image and its image is open in $X_S$. The set $\mathcal{T}_U(yL) =
\{z \exp_S(u) : z \in yL, u \in U\}$ is called a tube around the orbit $yL$ of width $U$. We often
denote the tube simply by $\mathcal{T}$. The width $U$ and the tube $\mathcal{T}$ are said to come from $V$.

A tube $\mathcal{T}_U(yL)$ gives us a coordinate system; a point of $\mathcal{T}$ can be written uniquely as
$z \exp_S(u)$. We refer to $z$ as the orbit coordinate and to $u$ as the width coordinate. We shall
need a few lemmas about tubes which we now turn to describe.

6.4.1. Measures on tubes. Given a tube $\mathcal{T} = \mathcal{T}_U(yL)$ around the compact orbit $yL$ coming
from $V$, one can construct the following two natural probability measures supported on
$\mathcal{T}$. The first is the normalized Haar measure $\frac{1}{m_S(\mathcal{T})} m_S|_{\mathcal{T}}$, which we will denote by $m_{\mathcal{T}}$.
The second is the (pushforward of the) product measure $\eta \times m_U$ on $yL \times U \simeq \mathcal{T}$, where $\eta$
is the unique $L$-invariant probability measure on the orbit $yL$ and $m_U$ is the normalized
restriction of the Haar measure on $V$ to $U$ (that is, $m_U = \frac{1}{m_V(U)} m_V|_U$). We shall need to
understand to some extent the connection between these two measures.

Lemma 6.11. The measure $m_{\mathcal{T}}$ is absolutely continuous with respect to $\eta \times m_U$. Moreover,
if we denote by $F(z, u)$ the Radon-Nikodym derivative, that is $dm_{\mathcal{T}} = F(z, u)\, d\eta(z)\, dm_U(u)$,
then for $\eta$-almost any $z \in yL$, $\int_U F(z, u)\, dm_U(u) = 1$.

Proof. The absolute continuity is left to be verified by the reader. As for the claim about
the density $F$, we argue as follows. Let $\varphi(z) = \int_U F(z, u)\, dm_U(u)$. We will show that $\varphi$ is
constant $\eta$-almost surely. As $\int_{yL} \varphi(z)\, d\eta(z) = m_{\mathcal{T}}(\mathcal{T}) = 1$, this constant must be equal to
one.

Choose a fundamental domain $\mathcal{F}$ in $L$ for the orbit $yL$ and identify it with the orbit. Note
that with this identification $\eta$ is just the restriction to $\mathcal{F}$ of a Haar measure on $L$ scaled
so that $\eta(\mathcal{F}) = 1$. Assume to get a contradiction that $\varphi$ is not constant $\eta$-almost surely. It
follows that there are constants $c_2 < c_1$ so that the sets $E_1 = \{h \in \mathcal{F} : \varphi(yh) > c_1\}$, $E_2 =
\{h \in \mathcal{F} : \varphi(yh) < c_2\}$ are of positive $\eta$-measure. There exists $h_0 \in L$ so that $\eta(E_1 \cap
h_0^{-1} E_2) > 0$ and so if we let $\tilde{E}_1 = E_1 \cap h_0^{-1} E_2$ and $\tilde{E}_2 = h_0 \tilde{E}_1$, then $\tilde{E}_i \subset E_i \subset \mathcal{F}$ are both of
(the same) positive $\eta$-measure and differ from one another by left translation by $h_0$. The
following calculation derives the desired contradiction:

$$c_1 \eta(\tilde{E}_1) < \int_{\tilde{E}_1} \varphi(z)\, d\eta(z)
= m_S(\tilde{E}_1 \exp_S(U)) = m_S(h_0 \tilde{E}_1 \exp_S(U)) = m_S(\tilde{E}_2 \exp_S(U))
= \int_{\tilde{E}_2} \varphi(z)\, d\eta(z) < c_2 \eta(\tilde{E}_2) = c_2 \eta(\tilde{E}_1). \tag{6.22}$$

□

Our aim now is to define the relevant family of tubes around the orbit $y_\omega L_\omega$ that will
be of use to us. The first stage is to choose the correct linear complement from which the
tubes will come.



(Footnote to the proof of Lemma 6.11: note that $L$ must be unimodular, hence a Haar measure on $L$ is both left and right invariant.)

6.4.2. Choosing the linear complement. When we come to argue the validity of Theorem 4.9
for a given admissible radius $h$, we may replace the set of primes $S_f$ by the
smallest set of primes on which $h$ is supported. Hence, without loss of generality we
may (and will) assume that $h$ is divisible by all the primes in $S_f$. We refer to such a radius
$h$ as having full support. The assumption that an admissible radius has full support is
equivalent to the fact that the weak stable algebra of $a_f(h)$ attains the form

(0s):; (h) = Boo ® peSf [ ( * * ) G 0P } . (6.23) 
Definition 6.12. Let $V = \oplus_{v \in S} V_v$ be defined as follows:

^ = {(2 S) G0 4 ; ForpeSf > Vp= {{o l) G0 4' 

Lemma 6.13. If the generalized branch $\mathcal{L}_\omega$ is nondegenerate then the subspace $V \subset
\mathfrak{g}_S$ from Definition 6.12 is indeed a linear complement of $\mathrm{Lie}(L_\omega)$ which is contained in
the weak stable algebra (6.23) of $a_f(h)$ for any admissible radius $h$ of full support.

Proof. The fact that $V$ is contained in the weak stable algebra follows from the discussion preceding Definition 6.12.
Recall that $L_\omega = A_\infty \times H_\omega$ where $H_\omega = \omega^{-1}\, \overline{\langle \gamma_x \rangle}^{G_{S_f}}\, \omega$ (see (6.5), (6.6)). Writing $\mathrm{Lie}(L_\omega) =
\oplus_S \mathfrak{l}_v$ we see that $V_\infty$ indeed complements $\mathfrak{l}_\infty$. Let $T$ be the algebraic subgroup of $G$
defined as the Zariski closure of the group generated by $\gamma_x$. It is a one dimensional torus
and $H_\omega$ is a compact open subgroup of the conjugation $\omega^{-1} T(\prod_{S_f} \mathbb{Z}_p)\, \omega$. It follows that
for any $v \in S_f$ the dimension of $\mathfrak{l}_v$ is 1 and so in order to argue that it complements $V_v$
we only need to argue that the inclusion $\mathfrak{l}_v \subset V_v$ does not hold. Such an inclusion would
imply that there is a neighborhood of the identity in $H_\omega$ that consists of upper triangular
matrices, which in turn would imply that a certain power of $\omega^{-1} \gamma_x \omega$ is upper triangular,
contradicting the assumption that the generalized branch is nondegenerate. □

Henceforth, when speaking about a linear complement $V$ to $\mathrm{Lie}(L_\omega)$, we shall refer only
to the subspace from Definition 6.12.

Remark 6.14. Because $V$ is contained in the weak stable algebra of $a_f(h)$ (for any admissible radius $h$ of full
support), we conclude that if $U_0 \subset V$ is a small enough ball around zero, the conjugation
$U_0^{a_f(h)}$ will be included in the domain of $\exp_S$. This implies that for any $U \subset U_0$ the
identity $(\exp_S U)^{a_f(h)} = \exp_S(U^{a_f(h)})$ holds. It follows that if $\mathcal{T}$ is a tube of width $U$
coming from $V$ around $y_\omega L_\omega$, then if the width $U$ is chosen within $U_0$, the pushed tube
$\mathcal{T} a_f(h)$ satisfies

$$\mathcal{T} a_f(h) = y_\omega L_\omega \exp_S(U)\, a_f(h) = y_{\omega,h} L_{\omega,h} \exp_S(U^{a_f(h)}). \tag{6.24}$$

That is, $\mathcal{T} a_f(h)$ is a tube of width $U^{a_f(h)}$ around the compact orbit $y_{\omega,h} L_{\omega,h}$. Below, we
will make the implicit assumption that all the widths considered are contained in the ball
$U_0$.

6.4.3. The tubes $\mathcal{T}_\omega^\delta$. As explained above, we will need to construct tubes with shrinking
real width component and with control on the subgroup of $K_{S_f}$ that stabilizes them.
After describing this family of tubes we state a few lemmas that describe their relevant
properties. The proofs of these lemmas will be postponed till after concluding the proof
of Theorem 4.9.

Let us denote by $\tilde{B}$ a compact open subgroup of the group of upper triangular elements
in $K_{S_f}$ that lies in the domain of $\log_{S_f}$ and for which Remark 6.14 applies (that is, all
conjugations $\tilde{B}^{a_f(h)}$ are in the domain of $\log_{S_f}$ for admissible radii $h$ of full support).

For $\delta > 0$ let $B_\delta^{V_\infty}$ be the ball of radius $\delta$ around $0$ in the $\infty$-component of the linear
complement $V$ from Definition 6.12.

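Lemma 6.15. There exists $\tilde\delta > 0$ and an open compact subgroup $B = \prod_{S_f} B_p$ of $\tilde{B}$, such
that for all $\delta < \tilde\delta$, if we let $U^\delta = B_\delta^{V_\infty} \times \log_{S_f}(B)$, then for any $\omega \in K_{S_f}$ such that the
generalized branch $\mathcal{L}_\omega$ is nondegenerate, the set $\mathcal{T}_\omega^\delta = y_\omega L_\omega \exp_S(U^\delta)$ is a tube around
$y_\omega L_\omega$; that is, the map $y_\omega L_\omega \times U^\delta \to \mathcal{T}_\omega^\delta$ is a homeomorphism and the set $\mathcal{T}_\omega^\delta \subset X_S$ is
open. Furthermore, the choice of $B, \tilde\delta$ depends only on the original class $x$ and the set of
places $S$ at hand. In particular, they are independent of $\omega$.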

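Lemma 6.16. Let $B, \tilde\delta$ be as in Lemma 6.15 and let $\omega \in K_{S_f}$ be such that the generalized
branch $\mathcal{L}_\omega$ is nondegenerate.

(1) There exists an open compact product subgroup $K^* = \prod_{S_f} K_p^* < K_{S_f}$ which stabilizes
the tube $\mathcal{T}_\omega^\delta$ for any $\delta < \tilde\delta$; that is, $\mathcal{T}_\omega^\delta k = \mathcal{T}_\omega^\delta$ for any $k \in K^*$, $\delta < \tilde\delta$. Moreover,
if $x$ is non-split, we may choose $K^*$ to be independent of $\omega$.

(2) The measures $m_S(\mathcal{T}_\omega^\delta)$ satisfy $m_S(\mathcal{T}_\omega^\delta) \gg_{x, S_f, \mathcal{L}_\omega} \delta^2$. If $x$ is non-split, the implicit
constant may be chosen to be independent of the generalized branch.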

6.5. Concluding the main part of the proof.

Proof of parts (1), (2) of Theorem 4.9. We follow the strategy presented in §6.2 and use
freely all the notation introduced so far. Let $h$ be an admissible radius and assume
without loss of generality that it is of full support. Let $\varphi_0 \in L^2(X_\infty, m_\infty)$ be a $\kappa$-Lipschitz
function. We let $\varphi = \varphi_0 \circ \pi$ be the lift of $\varphi_0$ to $X_S$. As $\int_{X_\infty} \varphi_0\, dm_\infty = \int_{X_S} \varphi\, dm_S$, we see
by Lemma 6.4 that part (1) of the theorem will follow once we prove

$$\Big| \int_{X_S} \varphi\, d\eta_{\omega,h} - \int_{X_S} \varphi\, dm_S \Big| \ll_{x, S_f, \mathcal{L}_\omega, \epsilon} \max\{\|\varphi\|_2, \kappa\}\, h^{-\frac{\delta_0}{2} + \epsilon}. \tag{6.25}$$

Part (2) will follow once we establish that in the non-split case, the implicit constant
in (6.25) does not depend on the generalized branch. Let $V < \mathfrak{g}_S$ be the linear complement
from Definition 6.12. We apply Lemma 6.16 and use the notation introduced there to
obtain a family of tubes $\mathcal{T}_\omega^\delta$ around $y_\omega L_\omega$ coming from $V$.

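We denote by $\mathcal{T}_{\omega,h}^\delta$ the (pushed) tube $\mathcal{T}_\omega^\delta a_f(h)$ around the orbit $y_{\omega,h} L_{\omega,h}$, and by $m_{\mathcal{T}_{\omega,h}^\delta}$ the
normalized restriction of $m_S$ to $\mathcal{T}_{\omega,h}^\delta$. The width of $\mathcal{T}_{\omega,h}^\delta$ is $U^{\delta,h} = (U^\delta)^{a_f(h)}$, where $U^\delta$ is
as in Lemma 6.15 (see Remark 6.14). (The reader should not confuse the superscript $\delta$ with our notation for conjugation.) We have

$$\Big| \int_{X_S} \varphi\, d\eta_{\omega,h} - \int_{X_S} \varphi\, dm_S \Big|
\le \underbrace{\Big| \int_{X_S} \varphi\, d\eta_{\omega,h} - \int_{X_S} \varphi\, dm_{\mathcal{T}_{\omega,h}^\delta} \Big|}_{(*)}
+ \underbrace{\Big| \int_{X_S} \varphi\, dm_{\mathcal{T}_{\omega,h}^\delta} - \int_{X_S} \varphi\, dm_S \Big|}_{(**)}. \tag{6.26}$$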
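To estimate $(*)$ we define $\varphi_{\delta,h} : \mathcal{T}_{\omega,h}^\delta \to \mathbb{C}$ by $\varphi_{\delta,h}(z \exp_S(u)) = \varphi(z)$ for $z \in y_{\omega,h} L_{\omega,h}$, $u \in U^{\delta,h}$
and extend it to be zero outside the tube $\mathcal{T}_{\omega,h}^\delta$ to obtain a function on $X_S$. By Lemma 6.11
it follows that

$$\int_{X_S} \varphi_{\delta,h}\, dm_{\mathcal{T}_{\omega,h}^\delta} = \int_{y_{\omega,h} L_{\omega,h}} \int_{U^{\delta,h}} \varphi(z) F(z, u)\, dm_{U^{\delta,h}}(u)\, d\eta_{\omega,h}(z) = \int_{X_S} \varphi\, d\eta_{\omega,h}. \tag{6.27}$$

We therefore have the following estimate for $(*)$:

$$(*) = \Big| \int_{X_S} \varphi_{\delta,h}\, dm_{\mathcal{T}_{\omega,h}^\delta} - \int_{X_S} \varphi\, dm_{\mathcal{T}_{\omega,h}^\delta} \Big|
\le \max\big\{ |\varphi(y \exp_S(w)) - \varphi(y)| : y \in y_{\omega,h} L_{\omega,h},\ w \in U^{\delta,h} \big\}. \tag{6.28}$$

Note that if we write $w \in U^{\delta,h}$ as $(w_\infty, w_f)$, then for any $y \in y_{\omega,h} L_{\omega,h}$, $\varphi(y \exp_S(w)) =
\varphi_0(\pi(y) \exp_\infty(w_\infty))$ by the $K_{S_f}$-invariance of $\varphi$. As the maps induced by the actions of
elements of the form $\exp_\infty(w_\infty)$, $\|w_\infty\| < 1$, are all Lipschitz with some uniform Lipschitz
constant $c_1$, the distance between $\pi(y)$ and $\pi(y) \exp_\infty(w_\infty)$ is $\le c_1 \delta$ and so by the Lipschitz
assumption on $\varphi_0$ we obtain

$$(*) \le c_1 \kappa \delta. \tag{6.29}$$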
We now estimate $(**)$. Let $\mathcal{H} = L^2(X_S, m_S)$ and denote

$$w_1 = \varphi, \qquad w_2 = \frac{1}{m_S(\mathcal{T}_\omega^\delta)}\, \chi_{\mathcal{T}_\omega^\delta}.$$

In order to appeal to Theorem 6.6 we observe that $w_1$ is $K_{S_f}$-fixed and $w_2$ is $K^*$-fixed,
where $K^*$ is as in Lemma 6.16. By Lemma 6.16 the index $d = [K_{S_f} : K^*]$ depends only
on $x$, $S_f$, and $\mathcal{L}_\omega$, and in the non-split case can be bounded by a number independent
of the generalized branch. As for the norms, $\|w_1\| = \|\varphi\|_2$, and for $w_2$ we have $\|w_2\| =
m_S(\mathcal{T}_\omega^\delta)^{-1/2}$. By Lemma 6.16 we have that $m_S(\mathcal{T}_\omega^\delta) \gg_{x, S_f, \mathcal{L}_\omega} \delta^2$ and so $\|w_2\| \ll_{x, S_f, \mathcal{L}_\omega} \delta^{-1}$.
Furthermore, in the non-split case, the implicit constant can be taken to be independent
of the generalized branch.

It now follows from Theorem 6.6 that

$$(**) = \Big| \Big\langle \varphi,\ a_f(h)^{-1} \Big( \frac{1}{m_S(\mathcal{T}_\omega^\delta)} \chi_{\mathcal{T}_\omega^\delta} \Big) \Big\rangle - \int_{X_S} \varphi\, dm_S \Big|
= \big| \langle a_f(h) w_1, w_2 \rangle - \langle w_1, 1\rangle \langle 1, w_2\rangle \big|
\ll_{x, S_f, \mathcal{L}_\omega, \epsilon} \|\varphi\|_2\, \delta^{-1} h^{-\delta_0 + \epsilon} \tag{6.30}$$

and that in the non-split case the implicit constant can be taken independent of the
generalized branch. Combining (6.26), (6.29), (6.30), and choosing $\delta = c\, h^{\frac{1}{2}(-\delta_0 + \epsilon)}$ (the
meaning of $c$ will become clear in a moment) we obtain (6.25) as desired (with $\epsilon$ replaced
by $\frac{\epsilon}{2}$). Here the constant $c$ is chosen to protect us from the possible finitely many $h$'s for
which the inequality $h^{\frac{1}{2}(-\delta_0 + \epsilon)} < \tilde\delta$ does not hold ($\tilde\delta$ as in Lemma 6.15). Note that indeed,
the constant $c$ depends only on $\tilde\delta$, $S_f$, and $\epsilon$. By Lemma 6.15 we see that it actually
depends on $x$, $S_f$, and $\epsilon$. This concludes the proof of Theorem 4.9. □
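
The choice of $\delta$ in the argument above balances the tube-width error from (6.29), which is linear in $\delta$, against the spectral error from (6.30), which is proportional to $\delta^{-1} h^{-\delta_0 + \epsilon}$. The following sketch checks this balance numerically. The numerical values of $\delta_0$, $\epsilon$, $\kappa$ and the constant `C` are hypothetical placeholders chosen only to illustrate the scaling; all implicit constants are set to 1.

```python
def total_error(delta, h, kappa=1.0, C=1.0, delta0=0.25, eps=0.01):
    """Sum of the two error terms: kappa*delta + C * h^(-delta0+eps) / delta.

    delta0, eps, kappa and C are illustrative placeholders, not values
    taken from the text.
    """
    return kappa * delta + C * h ** (-delta0 + eps) / delta


def balanced_width(h, delta0=0.25, eps=0.01):
    """Width delta = h^((-delta0+eps)/2), equalising the two terms when kappa = C = 1."""
    return h ** ((-delta0 + eps) / 2)


if __name__ == "__main__":
    for h in (10**3, 10**6, 10**9):
        d = balanced_width(h)
        # With this choice both terms equal d, so the total is 2*d.
        print(h, d, total_error(d, h))
```

With this choice the total error is of order $h^{\frac{1}{2}(-\delta_0 + \epsilon)}$, which is the rate appearing in (6.25).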

6.6. Proofs of Lemmas 6.15, 6.16. We shall need the following auxiliary lemma, which
we leave without proof.

Lemma 6.17. There exists a neighborhood of the identity $W \subset G_S$, depending only on
the class $x$, such that for any $\omega \in K_{S_f}$ and any $g \in W$, if $y_\omega L_\omega g \cap y_\omega L_\omega \ne \emptyset$ then $g \in L_\omega$.

Proof of Lemma 6.15. The first restriction we impose on $\tilde\delta$ is that it will be small enough
so that in the real component, the map $(s, u) \mapsto \exp_\infty(s) \cdot \exp_\infty(u)$ from $\mathrm{Lie}(A_\infty) \times B_{\tilde\delta}^{V_\infty} \to
G_\infty$ is a homeomorphism onto its open image. Choose $B = \prod_{S_f} B_p$ to be any product
compact open subgroup of $\tilde{B}$ and define $U^\delta$ as in the statement of the lemma. At this stage
we observe that for any $\delta < \tilde\delta$ the map $L_\omega \times U^\delta \to G_S$ given by $(g, u) \mapsto g \exp_S(u)$ has an
open image. To see this, note that the image is a product of open sets in each component:
In the real component the image equals $A_\infty \cdot \exp_\infty B_\delta^{V_\infty}$, which is open by the choice of $\tilde\delta$,
while for any finite place $p \in S_f$, the $p$-component of the image is $(H_\omega)_p \cdot B_p$, which is easily
seen to be open in the following way: Because $V_p = \mathrm{Lie}(B_p)$ is a linear
complement to $\mathrm{Lie}((H_\omega)_p)$, the product $(H_\omega)_p \cdot B_p$ clearly contains an open neighborhood
of the identity in $G_p$. It now follows from the fact that both $(H_\omega)_p, B_p$ are groups that
their product is actually an open set.

The above establishes in particular that the set $\mathcal{T}_\omega^\delta = y_\omega L_\omega \exp_S(U^\delta) \subset X_S$ is open. It
follows that in order to conclude that $\mathcal{T}_\omega^\delta$ is indeed a tube around $y_\omega L_\omega$, we only need to
argue the injectivity of the map $(z, u) \mapsto z \exp_S(u)$ from $y_\omega L_\omega \times U^\delta$ to $X_S$. We denote
this map by $\psi_\omega$.

The second condition which we impose on $\tilde\delta$ and on the choice of $B$ is that the product
$\big(\exp_\infty(B_{\tilde\delta}^{V_\infty})\big)^2 \cdot B^2 \subset W$, where $W$ is as in Lemma 6.17. Assuming the injectivity of $\psi_\omega$
fails, we obtain elements $u_\infty^{(i)} \in B_\delta^{V_\infty}$, $b_i \in B$, $i = 1, 2$, and a non-trivial intersection of the
form

$$y_\omega L_\omega \exp_\infty(u_\infty^{(1)})\, b_1 \cap y_\omega L_\omega \exp_\infty(u_\infty^{(2)})\, b_2.$$

This shows that $y_\omega L_\omega \cap y_\omega L_\omega \exp_\infty(u_\infty^{(2)}) \exp_\infty(-u_\infty^{(1)})\, b_2 b_1^{-1} \ne \emptyset$. It now follows from
our choice of $\tilde\delta$ and $B$ (by Lemma 6.17) that $\exp_\infty(u_\infty^{(2)}) \exp_\infty(-u_\infty^{(1)}) \in A_\infty$ and that
$b_2 b_1^{-1} \in H_\omega$. As $B$ is a group which intersects $H_\omega$ trivially (this is our assumption that
the generalized branch $\mathcal{L}_\omega$ is nondegenerate), we conclude that $b_1 = b_2$. Furthermore, from
the fact that $\mathrm{Lie}(A_\infty) \oplus V_\infty = \mathfrak{g}_\infty$, it is straightforward to deduce that if $\tilde\delta$ is chosen small
enough, then the inclusion $\exp_\infty(u_\infty^{(2)}) \exp_\infty(-u_\infty^{(1)}) \in A_\infty$ implies that $u_\infty^{(1)} = u_\infty^{(2)}$. This
establishes the injectivity of $\psi_\omega$ as desired. □

Proof of Lemma 6.16. We first argue the validity of part (1). As pointed out in the proof
of Lemma 6.15 above, if $\omega$ is such that $\mathcal{L}_\omega$ is nondegenerate, then the set $H_\omega \cdot B \subset G_{S_f}$ is
open (here $B$ is as in Lemma 6.15). Moreover, as $B$ is compact, there exists a neighborhood
of the identity in $G_{S_f}$, and in particular a compact open subgroup $K^* = \prod_{S_f} K_p^*$, with
the property that for any $k \in K^*$ we have $Bk \subset H_\omega B$. We now claim that for any tube
$\mathcal{T}_\omega^\delta$ as in Lemma 6.15 we have $\mathcal{T}_\omega^\delta k = \mathcal{T}_\omega^\delta$. To argue the inclusion $\subset$ we note the following:

$$\mathcal{T}_\omega^\delta k = y_\omega L_\omega \exp_\infty(B_\delta^{V_\infty})\, B k \subset y_\omega L_\omega \exp_\infty(B_\delta^{V_\infty})\, H_\omega B = y_\omega L_\omega \exp_\infty(B_\delta^{V_\infty})\, B = \mathcal{T}_\omega^\delta.$$

The opposite inclusion follows by switching $k$ with $k^{-1}$.

If $x$ is non-split, it is not hard to see that the intersection $\cap_{\omega \in K_{S_f}} (H_\omega \cdot B)$ contains an
open neighborhood around $e_f$. It then readily follows that this intersection contains an
open neighborhood of $B$. We conclude, similarly to the argument presented above, that
the group $K^*$ may be chosen to work for all the $\omega$'s simultaneously.

We briefly argue part (2) of the lemma. For each relevant $\omega$, it is not hard to see that
the volume of the tube $m_S(\mathcal{T}_\omega^\delta)$ satisfies $c_1 m_{V_\infty}(B_\delta^{V_\infty}) \le m_S(\mathcal{T}_\omega^\delta) \le c_2 m_{V_\infty}(B_\delta^{V_\infty})$, where
the constants $c_1, c_2$ are determined by the volume of the orbit $y_\omega L_\omega$ and the position of
the linear space $V$, from which the width is coming, with respect to $\mathrm{Lie}(L_\omega)$. As the
2-dimensional volume $m_{V_\infty}(B_\delta^{V_\infty})$ is proportional to $\delta^2$, the claim regarding a single $\omega$
follows. In the non-split case the Lie algebras $\mathrm{Lie}(L_\omega)$ are uniformly transverse to $V$, and so
the constant $c_1$ above can be taken to be uniform for all $\omega$, which finishes the proof. □

7. Proof of Lemma 5.1

We begin with a short discussion that will be helpful later on. Let $S_f$ be a finite set of
primes. For an element $\delta \in G_{S_f}$ we denote $\Sigma_\delta = \overline{\langle \delta \rangle}^{G_{S_f}}$ and we say that $\delta$ is of compact
type if $\Sigma_\delta$ is a compact group.

Definition 7.1. Let $\delta \in G_{S_f}$ be an element of compact type. Let us denote for any
admissible radius $h$ by $k_h(\delta)$ the minimal positive integer $k$ for which $\delta^k$ belongs to the
compact open subgroup $a_f(h) K_{S_f} a_f(h^{-1})$.

Equivalently, $k_h(\delta)$ is the order of (the image of) $\delta$ in the group $\Sigma_\delta / (\Sigma_\delta \cap a_f(h) K_{S_f} a_f(h^{-1}))$
(note that this group is finite because $a_f(h) K_{S_f} a_f(h^{-1})$ is open in $G_{S_f}$). It turns out that
the behavior of the function $h \mapsto k_h(\delta)$ is essential for the proof of Lemma 5.1. We first
state and prove the following:

Lemma 7.2. Let $\delta \in G_{S_f}$ be an element of compact type and let $h \mapsto k_h(\delta)$ be the function
defined above. Let $e_h(\delta)$ be the positive number defined by the equation $k_h(\delta) = e_h(\delta) h$.
Then the function $h \mapsto e_h(\delta)$ attains only finitely many values and furthermore, if $h_n$ is a
divisibility sequence (i.e. $h_n \mid h_{n+1}$), then the sequence $e_{h_n}(\delta)$ stabilizes.

Proof. Step 0. Let us denote for an admissible radius $h$ by $n_p(h)$ the integers satisfying
$h = \prod_{p \in S_f} p^{n_p(h)}$. We have the equality

$$a_f(h) K_{S_f} a_f(h^{-1}) = \prod_{p \in S_f} a_p(p^{n_p(h)}) K_p a_p(p^{-n_p(h)}),$$

and so the integer $k_h(\delta)$ is the least common multiple (lcm) of the integers which we
temporarily denote $j_p$, where $j_p$ is defined to be the minimal integer $j$ so that the $j$'th
power of the $p$-component $\delta_p^j$ belongs to $a_p(p^{n_p(h)}) K_p a_p(p^{-n_p(h)})$. A moment of thought
shows that this implies that the statement of the lemma for a general finite set of primes
$S_f$ follows from the corresponding statement for a single prime. Hence, from now on
till the end of the proof we will assume that $S_f$ consists of a single prime $p$ and so
the admissible radii that will be considered are positive powers of $p$. Let us denote
$K_{p,n} = a_p(p^n) K_p a_p(p^{-n})$.
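
The reduction in Step 0 can be made concrete in a toy case. The sketch below assumes that every $p$-component of $\delta$ is the integer unipotent matrix with rows $(1, 0)$ and $(c, 1)$ for a fixed nonzero integer $c$. Then $\delta^j$ has lower-left entry $jc$, so by the description of $K_{p,n}$ in (7.1) the integer $j_p$ is the least $j$ with $p^{n_p(h)} \mid jc$. The helper names `valuation`, `j_p` and `k_h` are hypothetical and are not taken from the text.

```python
from fractions import Fraction
from math import lcm  # math.lcm requires Python 3.9+


def valuation(m, p):
    """p-adic valuation of the nonzero integer m."""
    v = 0
    while m % p == 0:
        m //= p
        v += 1
    return v


def j_p(c, p, n):
    """Least j >= 1 with p^n | j*c, namely p^max(0, n - v_p(c)) (toy unipotent case)."""
    return p ** max(0, n - valuation(c, p))


def k_h(c, exponents):
    """exponents maps each prime p to n_p(h); returns the pair (k_h, h)."""
    h, k = 1, 1
    for p, n in exponents.items():
        h *= p ** n
        k = lcm(k, j_p(c, p, n))   # k_h is the lcm of the local integers j_p
    return k, h


if __name__ == "__main__":
    for exps in ({2: 3, 3: 3}, {2: 4, 3: 4}, {2: 5, 3: 6}):
        k, h = k_h(36, exps)
        print(h, k, Fraction(k, h))   # the ratio e_h = k_h / h
```

In this toy case the ratio $e_h = k_h/h$ takes the same value for every radius divisible by $c$, in line with the stabilization asserted in Lemma 7.2.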

Step 1. We claim that for all large enough $n$, $\Sigma_\delta \cap K_{p,n} < \Sigma_\delta \cap K_{p,(n-1)}$. To see this, note
that the fact that $\delta$ is of compact type implies that when we represent an element $h \in \Sigma_\delta$
by a matrix in such a way that one of its entries is in $\mathbb{Z}_p$, then all the other entries must
have $p$-adic absolute value bounded by $p^{r_0}$ for some fixed $r_0$ (that depends on $\delta$).
On the other hand

$$K_{p,n} = \left\{ \begin{pmatrix} a & p^{-n} b \\ p^{n} c & d \end{pmatrix} : a, b, c, d \in \mathbb{Z}_p,\ ad - bc \in \mathbb{Z}_p^* \right\}. \tag{7.1}$$

It follows that for $n > r_0$, given $h \in \Sigma_\delta \cap K_{p,n}$, when we write $h$ in the form given in (7.1)
we must have that $b = p b'$ with $b' \in \mathbb{Z}_p$ and so if we denote $c' = pc$ then

$$h = \begin{pmatrix} a & p^{-(n-1)} b' \\ p^{n-1} c' & d \end{pmatrix},$$

and furthermore, $ad - bc = ad - b'c'$. This implies that $h \in K_{p,(n-1)}$ as desired.
The above shows that for all large enough $n$ there is an onto homomorphism

$$\Sigma_\delta / (\Sigma_\delta \cap K_{p,n}) \to \Sigma_\delta / (\Sigma_\delta \cap K_{p,(n-1)}),$$

which implies that $k_{p^{n-1}}(\delta)$ divides $k_{p^n}(\delta)$ for all large enough $n$. This divisibility relation
will be needed in the next step.

Step 2. Let $K$ be a compact open subgroup of $G_p$ and let $k_0$ be the minimal positive
integer $k$ so that $\delta^k \in K$. The following divisibility relation is straightforward:

$$k_{p^n}(\delta^{k_0}) \mid k_{p^n}(\delta) \mid k_0\, k_{p^n}(\delta^{k_0}). \tag{7.2}$$

It is straightforward to show that (7.2) together with step 1 imply that the validity of
the lemma for $\delta$ follows from its validity for $\delta^{k_0}$. This allows us to assume without loss
of generality that we start with $\delta \in K$, where $K$ is some open compact subgroup of $G_p$
chosen at our convenience. We choose $K$ to be equal to the subgroup of $K_p$ consisting of
elements having representatives which are matrices congruent to the identity modulo $p^2$.

The congruence relation modulo $p^2$ will be used towards the end of the proof. Moreover,
the fact that we assume that $\delta \in K_p$ implies that $k_{p^n}(\delta)$ is the minimal positive integer
$k$ so that $\delta^k$ belongs to $K_p \cap K_{p,n}$. Let us denote $B_n = K_p \cap K_{p,n}$. A direct calculation
shows that

$$B_n = \left\{ \begin{pmatrix} a & b \\ p^{n} c & d \end{pmatrix} \in K_p : a, b, c, d \in \mathbb{Z}_p \right\}. \tag{7.3}$$

Similarly to step 1, the fact that $B_{n+1} < B_n$ implies the divisibility relation $k_{p^n}(\delta) \mid k_{p^{n+1}}(\delta)$
for any $n$.

Step 3. Working under the above assumptions on $\delta$, let $n_0$ be the maximal integer $n$ so
that $\delta \in B_n$.

Claim: For any $n \ge n_0$ we have $k_{p^n}(\delta) = p^{n - n_0}$ and moreover, $\delta^{p^{n - n_0}} \in B_n \setminus B_{n+1}$.

Clearly, the validity of the above claim finishes the proof of the lemma. We prove it by
induction on $n$. For $n = n_0$ the validity of the claim follows from the choice of $n_0$. Let us
assume it holds for $n$. We know from step 2 that $k_{p^n}(\delta) \mid k_{p^{n+1}}(\delta)$ and from our inductive
hypothesis, saying that $\delta^{k_{p^n}(\delta)} \notin B_{n+1}$, that this divisibility relation is strict. It follows
that $k_{p^{n+1}}(\delta) = j_0\, p^{n - n_0}$ where $j_0$ is the minimal positive integer $j$ so that $\delta^{j p^{n - n_0}} \in B_{n+1}$,
or said differently, such that the bottom left coordinate of $\delta^{j p^{n - n_0}}$ is divisible by $p^{n+1}$ in
$\mathbb{Z}_p$. We will be finished once we show two things:

(1) First, that $j_0 = p$ and so $k_{p^{n+1}}(\delta) = p^{n + 1 - n_0}$,

(2) and second, that the bottom left coordinate of $\delta^{p^{n + 1 - n_0}}$ is not divisible by $p^{n+2}$ and
so $\delta^{k_{p^{n+1}}(\delta)} \in B_{n+1} \setminus B_{n+2}$, which completes the inductive step.

Consider the sequence $\delta^{j p^{n - n_0}}$, $j = 1, 2, \dots$, denote

$$\delta^{j p^{n - n_0}} = \begin{pmatrix} a_j & b_j \\ c_j & d_j \end{pmatrix},$$

and note the recursive relation

$$c_{j+1} = c_1 a_j + c_j d_1. \tag{7.4}$$

We expand $c_1$ to a power series in $\mathbb{Z}_p$, use the inductive assumption that $\delta^{p^{n - n_0}} \in
B_n \setminus B_{n+1}$, and write

$$c_1 = m_1 p^{n} + m_2 p^{n+1} + u p^{n+2}, \tag{7.5}$$

where $m_1 \in \{1, 2, \dots, p - 1\}$, $m_2 \in \{0, 1, \dots, p - 1\}$, $u \in \mathbb{Z}_p$. We claim that for any $j \ge 1$
we have

$$c_j = j m_1 p^{n} + j m_2 p^{n+1} + u_j p^{n+2} \quad \text{where } u_j \in \mathbb{Z}_p. \tag{7.6}$$

The validity of (1), (2) follows at once from (7.6) and the fact that $m_1 \in \{1, 2, \dots, p - 1\}$.
We prove the validity of (7.6) by induction on $j$. For $j = 1$, this is exactly (7.5). Now
assume it holds for $j$ and write (using the congruence assumption on $\delta$ established in step
2)

$$a_j = 1 + p^2 A, \qquad d_1 = 1 + p^2 D, \qquad A, D \in \mathbb{Z}_p.$$

Plugging this and (7.5), (7.6) into the recursive relation (7.4) we see that indeed

$$c_{j+1} = (j + 1) m_1 p^{n} + (j + 1) m_2 p^{n+1} + p^{n+2}(\dots)$$

as desired. This completes the proof. □
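
The claim of Step 3 can be checked by brute force in small cases. The sketch below takes an integer matrix $\delta$ congruent to the identity modulo $p^2$, with determinant prime to $p$, and searches for the least $k$ such that the lower-left entry of $\delta^k$ is divisible by $p^n$, that is, such that $\delta^k \in B_n$. The function names and the sample matrices are hypothetical choices for the illustration, and working with integer matrices modulo $p^n$ is an assumption that stands in for the $p$-adic setting of the proof.

```python
def mat_mul_mod(A, B, m):
    """Product of two 2x2 integer matrices, entries reduced modulo m."""
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    return (((a * e + b * g) % m, (a * f + b * h) % m),
            ((c * e + d * g) % m, (c * f + d * h) % m))


def k_pn(delta, p, n, max_k=10**6):
    """Least k >= 1 such that the lower-left entry of delta^k is 0 mod p^n."""
    m = p ** n
    base = tuple(tuple(x % m for x in row) for row in delta)
    power = base
    for k in range(1, max_k + 1):
        if power[1][0] == 0:
            return k
        power = mat_mul_mod(power, base, m)
    raise ValueError("no exponent found below max_k")


if __name__ == "__main__":
    # delta = I mod 9, lower-left entry 9 = 3^2, so n_0 = 2 for p = 3.
    delta = ((10, 9), (9, 10))
    for n in range(2, 7):
        print(n, k_pn(delta, 3, n), 3 ** (n - 2))
```

For the sample matrix the lower-left entry of $\delta^k$ is $(19^k - 1)/2$, whose 3-adic valuation is $2 + v_3(k)$, so the search returns $3^{n-2}$ as the claim predicts.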
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Remark 7.3. It will be useful later on to note the following: A careful look at the 
argument giving Lemma 7.2 shows that for a fixed $\delta \in G_{S_f}$ of compact type,
there exists a positive constant $c$ such that $c < e_h(\delta)$ for any admissible radius $h$, where
$c$ depends only on two things:

(1) The power $k_0$ to which we need to raise $\delta$ so that each component $(\delta_p)^{k_0}$ will be in
$K_p$ and congruent to the identity mod $p^2$.

(2) The maximal admissible radius $h = \prod_{p \in S_f} p^{n_p}$ for which for any $p \in S_f$, $(\delta_p)^{k_0} \in B_{n_p}$
(this $h$ measures how close $\delta^{k_0}$ is to being upper triangular).

Before turning to the proof of Lemma 5.1 we make yet another remark which will be
used in the course of its proof.

Remark 7.4. Given a class $x \in X_\infty$ with a periodic $A_\infty$-orbit and a representative $\Lambda_x \in x$,
the matrix $a_\infty(t_x) = \mathrm{diag}(e^{t_x/2}, e^{-t_x/2})$ stabilizes the lattice $\Lambda_x$ (as a subset of
$\mathbb{R}^2$, $\Lambda_x = \Lambda_x a_\infty(t_x)$). As $\Lambda_x$ is a lattice, it follows that $a_\infty(t_x)$ is conjugate to an integer
matrix and so its eigenvalues $e^{\pm t_x/2}$ are algebraic integers of degree 2. The quadratic
extension $\mathbb{F}_x$ from Remark 4.5 is the one generated by them. As these eigenvalues are
Galois conjugates whose product is equal to 1, we conclude furthermore that they are
units in the ring of integers of $\mathbb{F}_x$ and as such, by Dirichlet's unit theorem, they are
integer powers of the fundamental unit of this field. In fact, the term "fundamental unit"
will sometimes be used here to refer to the unit in the ring of integers which is of absolute
value $> 1$ and which generates the group of totally positive units (i.e. those units both of
whose embeddings into the reals are positive). If the fundamental unit is $\epsilon = e^{t_0/2}$, then
the reader will easily verify that the image $\Lambda_x a_\infty(t_0)$ is contained in the $\mathbb{Q}$-span of $\Lambda_x$.
This shows that if we write $x = \Gamma_\infty g_x$, then there is a rational matrix $\delta_x$ which solves
$\delta_x g_x = g_x a_\infty(t_0)$ and in fact, $t_x = k t_0$ where $k$ is the minimal positive integer such that
$\delta_x^k$ is an integer matrix.
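
To make Remark 7.4 concrete, the sketch below takes a hyperbolic integer matrix of determinant 1, standing in for the integer matrix to which $a_\infty(t_x)$ is conjugate, and computes its larger eigenvalue $\lambda$ together with the parameter $t = 2 \log \lambda$. This assumes the normalization $a_\infty(t) = \mathrm{diag}(e^{t/2}, e^{-t/2})$. The function name `period_length` and the sample matrix are hypothetical examples and do not come from the text.

```python
import math


def period_length(a, b, c, d):
    """For the integer matrix [[a, b], [c, d]] with det 1 and |trace| > 2,
    return (lam, t): lam > 1 is the larger root of x^2 - |trace|*x + 1 and
    t = 2*log(lam), so the eigenvalues are e^{t/2} and e^{-t/2} up to sign.
    Assumed normalization: a_inf(t) = diag(e^{t/2}, e^{-t/2}).
    """
    if a * d - b * c != 1:
        raise ValueError("determinant must be 1")
    tr = abs(a + d)
    if tr <= 2:
        raise ValueError("matrix is not hyperbolic")
    lam = (tr + math.sqrt(tr * tr - 4)) / 2
    return lam, 2 * math.log(lam)


if __name__ == "__main__":
    lam, t = period_length(2, 1, 1, 1)
    print(lam, t)
```

Since the eigenvalues are roots of the monic integer polynomial $x^2 - (\mathrm{trace}) x + 1$, they are algebraic integers with product 1, hence units. For the sample matrix, $\lambda = (3 + \sqrt{5})/2$ lies in $\mathbb{Q}(\sqrt{5})$ and is the square of the golden ratio.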

Proof of Lemma 5.1. We first argue part (1) of the lemma. A short counting argument
shows that the cardinality of the sphere $S_h(x)$ is proportional to $h$ (where the proportionality
constant depends on $S_f$). For each $x'$ on the sphere, let $s_{x'}$ be the minimal positive number
$s$ such that $x' a_\infty(s)$ returns to the sphere. The total length is then $t_x(h) = \sum_{x' \in S_h(x)} s_{x'}$.
We will show below that for any $x' \in S_h(x)$, $s_{x'} \le t_x$. This will establish the inequality
$t_x(h) \ll_{x, S_f} h$, which is half of the statement in part (1) of the lemma. The other half,
namely the inequality $h \ll_{x, S_f} t_x(h)$, actually follows from part (2) of the lemma.

Let $x' \in S_h(x)$ be given. By Lemma 4.1 we see that there exists $y \in \pi^{-1}(x)$ such that
$x' = \pi(y a_f(h))$. As $\pi$ intertwines the $A_\infty$-actions on $X_S, X_\infty$ we see that $x = x a_\infty(t_x) =
\pi(y a_\infty(t_x))$ and so if we let $y' = y a_\infty(t_x)$ then $y' \in \pi^{-1}(x)$ and again by Lemma 4.1 we have
that $x'' = \pi(y' a_f(h)) \in S_h(x)$. The following calculation then shows that indeed $s_{x'} \le t_x$
as was claimed:

$$x' a_\infty(t_x) = \pi(y a_f(h) a_\infty(t_x)) = \pi(y a_\infty(t_x) a_f(h)) = \pi(y' a_f(h)) = x'' \in S_h(x).$$

We now turn to argue part (2) of the lemma. Let $\mathcal{L} = \mathcal{L}_{g_x, \omega}$ be a nondegenerate
generalized branch of $\mathcal{G}_S(x)$ (here $x = \Gamma_\infty g_x$ and $\omega \in K_{S_f}$). Let $\mathbb{F}_x$ be the quadratic
field from which the periodic orbit $x A_\infty$ arises and let $t_0 > 0$ be such that $e^{t_0/2}$ is
the fundamental unit of $\mathbb{F}_x$ as in Remark 7.4. In the notation of the same remark, let $\delta_x$
be the rational matrix satisfying $\delta_x g_x = g_x a_\infty(t_0)$. We replace the set of places $S$ by a
bigger set $\tilde{S}$, if necessary, so that $\delta_x \in \Gamma_{\tilde{S}}$. We then consider the bigger graph $\mathcal{G}_{\tilde{S}}(x)$, which
contains the original graph, and we further consider its following generalized branch: Write
$\tilde{S}_f = S_f \cup T$ and define $\tilde\omega \in K_{\tilde{S}_f}$ to be identical to $\omega$ in the components corresponding to
the primes in $S_f$ and equal to the identity in the components corresponding to primes in $T$.
We then define $\tilde{\mathcal{L}}$ to be the generalized branch $\mathcal{L}_{g_x, \tilde\omega}$ of $\mathcal{G}_{\tilde{S}}(x)$. Note that because of the
way we defined $\tilde\omega$, the generalized branch $\tilde{\mathcal{L}}$ is nondegenerate as well.

Denote as before by $x_h$ the class in $\tilde{\mathcal{L}} \cap S_h(x)$. We are interested in analyzing the length
$t_{x_h}$ of the orbit $x_h A_\infty$. By Remark 7.4 there exists a positive integer $\tilde{k}_h$ satisfying

$$t_{x_h} = \tilde{k}_h t_0. \tag{7.7}$$

In fact, for later purposes, note that in our discussion $x$ and the representative $g_x$ are fixed
but we will play with the branch later on, i.e. with the choice of $\tilde\omega$ (which in our setting
is defined by $\omega$), and so we should actually record the dependency on $\omega$ in our notation
and denote $\tilde{k}_h(\omega)$. The function $k_h(\cdot)$ from Definition 7.1 and $\tilde{k}_h(\cdot)$ are closely related, as
will be seen below.

The number $\tilde{k}_h(\omega)$ is by definition the minimal positive integer $k$ such that $x_h a_\infty(k t_0) =
x_h$ or, if we prefer working in the extension $X_{\tilde{S}}$, it is the minimal positive integer $k$ so
that $\Gamma_{\tilde{S}}(g_x, \tilde\omega) a_f(h) a_\infty(k t_0)$ returns to the fiber $\pi^{-1}(x_h)$. Because of the identity $\delta_x g_x =
g_x a_\infty(t_0)$ and the fact that $\delta_x \in \Gamma_{\tilde{S}}$ we see that $\Gamma_{\tilde{S}}(g_x, \tilde\omega) a_f(h) a_\infty(k t_0) = \Gamma_{\tilde{S}}(g_x, \delta_x^{-k} \tilde\omega a_f(h))$,
and so this point lies in the same fiber as $\Gamma_{\tilde{S}}(g_x, \tilde\omega a_f(h))$ (i.e. above $x_h$) if and only if the
quotient $a_f(h^{-1}) \tilde\omega^{-1} \delta_x^{-k} \tilde\omega a_f(h)$ belongs to $K_{\tilde{S}_f}$ (see Remark 3.1). That is, $\tilde{k}_h(\omega)$ is the
minimal positive integer $k$ for which $(\tilde\omega^{-1} \delta_x \tilde\omega)^k \in a_f(h) K_{\tilde{S}_f} a_f(h^{-1})$. This establishes
the equality

$$k_h(\tilde\omega^{-1} \delta_x \tilde\omega) = \tilde{k}_h(\omega).$$

The validity of part (2) of the lemma now follows immediately from Lemma 7.2 and (7.7),
which together imply $c_{\mathcal{L}_{g_x, \omega}}(h) = t_0\, e_h(\tilde\omega^{-1} \delta_x \tilde\omega)$.

As for part (3) we argue as follows: Assume first that $x$ is non-split with respect to $S_f$.
As noted in Remark 7.3 the lower bound for the function $h \mapsto e_h(\tilde\omega^{-1} \delta_x \tilde\omega)$, which gives us the
lower bounds for the functions $c_{\mathcal{L}}(h)$, depends only on two things:

(1) The smallest power $k_0$ for which $\delta_x^{k_0}$ belongs to the subgroup of $K_{\tilde{S}_f}$ consisting of
elements congruent to the identity modulo $p^2$ in each component (note that we
may ignore the conjugation by $\tilde\omega$ as this is a normal subgroup of $K_{\tilde{S}_f}$).

(2) The $p$-adic norms $|c_p|_p$, where $c_p$ is the left bottom coordinate of the $p$-component
of $(\tilde\omega^{-1} \delta_x \tilde\omega)^{k_0}$, where $p \in \tilde{S}_f$.

It is clear that $k_0$ depends only on $x$ and the original set of primes $S_f$ and does not vary
with $\omega$ (i.e. with the generalized branch). Also, for primes $p \in S_f$, the $p$-adic norm $|c_p|_p$
is bounded from below as $\omega$ ranges over $K_{S_f}$ because $x$ is non-split. Finally, for the
primes $p \in \tilde{S}_f \setminus S_f$, as the $p$'th component of $\tilde\omega$ equals the identity, the $p$'th component
of $(\tilde\omega^{-1} \delta_x \tilde\omega)^{k_0}$ is independent of $\omega$. We conclude that



$$\inf \{ c_{\mathcal{L}_{g_x,\omega}}(h) : \omega \in K_{S_f},\ h \text{ is an admissible radius} \} > 0,$$

as desired. We leave it as an exercise to the reader to show that in the split case this
infimum equals zero. □

8. Applications to continued fractions

In this section we prove the results stated in §2.3. The main goal is the deduction
of Theorem 2.1 from Theorem 2.8. Throughout the rest of this paper we use slightly
different notation from that previously introduced. We let $G = \mathrm{PSL}_2(\mathbb{R})$, $\Gamma = \mathrm{PSL}_2(\mathbb{Z})$,
and $X = \Gamma\backslash G$. We denote the projection from G to X by $\pi$. The group G acts on the
upper half plane $\mathbb{H} = \{z = x + iy : y > 0\}$ by Möbius transformations; if g is represented
by a matrix

$$g = \begin{pmatrix} a & b \\ c & d \end{pmatrix}, \tag{8.1}$$

then for $z \in \mathbb{H}$, $gz = \frac{az+b}{cz+d}$. This action preserves the hyperbolic metric $ds^2 = \frac{dx^2+dy^2}{y^2}$ and
so induces an action of G on the unit tangent bundle $T^1\mathbb{H}$. The action of G on $T^1\mathbb{H}$ is
free and transitive, hence allows us to identify G with $T^1\mathbb{H}$ once we choose a base point.
We make the usual choice of the base point to be the tangent vector pointing upwards
through $i \in \mathbb{H}$. With this identification the geodesic flow on $G = T^1\mathbb{H}$ corresponds to the
action from the right of the positive diagonal subgroup

$$A = \{a(t)\} = \{\operatorname{diag}(e^{t/2}, e^{-t/2}) : t \in \mathbb{R}\} < G.$$

Remark 8.1. The reason we chose to work in previous sections with the group $\mathrm{PGL}_2$
rather than with $\mathrm{PSL}_2$, which fits our application better, is that if one works with
$\mathrm{PSL}_2$, equation (4.11), which lies at the heart of our arguments, needs to be adjusted.
The diagonal matrix $\operatorname{diag}(q, 1)$ is no longer an element of the group (and so not an
element of the lattice) and needs to be replaced with $\operatorname{diag}(q, q^{-1})$. This then produces
a slightly weaker relation: it relates the geodesic loop corresponding to a
quadratic irrational $\alpha$ to the one corresponding to $q^2\alpha$ (rather than $q\alpha$). This would have
allowed us to prove only equidistribution along sequences of the form $q^2\alpha$. Note however
that what makes it possible for us to apply the results of previous sections to the above
setting is that the natural map from $\mathrm{PSL}_2(\mathbb{Z})\backslash\mathrm{PSL}_2(\mathbb{R})$ to $\mathrm{PGL}_2(\mathbb{Z})\backslash\mathrm{PGL}_2(\mathbb{R})$ is one to
one and onto. This is not the case when one replaces $\mathbb{R}, \mathbb{Z}$ with other rings.

In the following subsections we briefly recall the connection between the geodesic flow 
on X and continued fractions. As mentioned in the introduction, this connection was 
discovered by Artin in [Art82] and described in great detail in [Ser85]. The reader who
is unfamiliar with this material is advised to consult [EW11, §9.6], as we shall follow the
exposition presented there.

8.1. Cross-sections. We now introduce the notion of a cross-section. We are deliberately
restrictive below, as we only wish to discuss one specific example and see no
use in greater generality. Given a Borel measurable set $C \subset X$, we let $r_C : C \to \mathbb{R}_{>0} \cup \{\infty\}$
be defined by $r_C(x) = \inf\{t > 0 : xa(t) \in C\}$. The function $r_C$ is called the return time
function to C. The set C is called a cross-section for a(t) if the return time functions for
positive and negative times are bounded from below by some fixed positive number and the
map $(x,t) \mapsto xa(t)$ from $\{(x,t) : x \in C,\ 0 \le t < r_C(x)\}$ to X is a measurable isomorphism
onto its image in X. The first return map $T_C$ is defined by $T_C(x) = xa(r_C(x))$ where
this makes sense, i.e. for x belonging to $\{x \in C : r_C(x) < \infty\}$. In fact, we will be interested
only in points which return to C infinitely often in the future and in the past; thus we define
the domain of the first return map to be

$$\mathrm{Dom}_{T_C} = \{x \in C : \text{there are infinitely many positive and negative } t\text{'s with } xa(t) \in C\}. \tag{8.2}$$

Note that $T_C : \mathrm{Dom}_{T_C} \to \mathrm{Dom}_{T_C}$ is invertible.

We now define the relevant cross-section for the geodesic flow on X. An element
$g \in G$ represented by a matrix as in (8.1) corresponds to a tangent vector of unit length to
the upper half plane. It then defines a geodesic in $\mathbb{H}$ which hits the boundary of $\mathbb{H}$ in two
points. We denote the endpoint and startpoint of the geodesic it defines by $e_+(g), e_-(g)$
respectively. Clearly we have $e_+(g) = \frac{a}{c}$, $e_-(g) = \frac{b}{d}$, where we allow $\infty$ as a possible value.
Any element $g \in G$ has a unique decomposition (the Iwasawa decomposition) of the form

$$g = n(t)a(s)k_\theta = \begin{pmatrix} 1 & t \\ 0 & 1 \end{pmatrix} \begin{pmatrix} e^{s/2} & 0 \\ 0 & e^{-s/2} \end{pmatrix} \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}, \tag{8.3}$$

where $t, s \in \mathbb{R}$ and $\theta \in [0, \pi)$. The notation $n(t), a(s), k_\theta$ should be understood from (8.3).
An element g having the above decomposition corresponds to the tangent vector at the
point $t + ie^s \in \mathbb{H}$ at angle $2\theta$ in the clockwise direction from the vector pointing upwards.
Consider the following sets:

$$\tilde C^+ = \{g = a(s)k_\theta \in G : e_+(g) \in (0,1),\ e_-(g) < -1\};$$
$$\tilde C^- = \{g = a(s)k_\theta \in G : e_+(g) \in (-1,0),\ e_-(g) > 1\}; \tag{8.4}$$
$$\tilde C = \tilde C^+ \cup \tilde C^-.$$

The set $\tilde C$ consists of those tangent vectors whose base-point lies on the imaginary axis
with some restriction on the angle $\theta$ related to the height $e^s$ of the base point. It should
be clear from the geometric picture described above that the range of 'allowed angles' for
such a tangent vector, say in $\tilde C^+$, is a subinterval of $(\frac{\pi}{4}, \frac{\pi}{2})$ with $\frac{\pi}{2}$ being its right endpoint.
In §9 we will work out these intervals exactly. We denote the sets $\pi(\tilde C), \pi(\tilde C^+), \pi(\tilde C^-)$ by
$C, C^+, C^-$ respectively. The following lemma is proved in [EW11, §9.6].
Lemma 8.2. The following hold:

(1) The set $\tilde C$ injects into X under $\pi$; that is, to each $x \in C$ there corresponds a unique
$g \in \tilde C$ with $\pi(g) = x$.

(2) The set C is a cross-section for the geodesic flow on X.

(3) The domain of $T_C$ corresponds to those $g \in \tilde C$ for which both $e_+(g), e_-(g)$ are
irrational.

(4) For $g \in \tilde C^+$, if $T_C(\pi(g))$ is defined, then $T_C(\pi(g)) \in C^-$. An analogous statement
with + replaced by − holds.
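
The Iwasawa coordinates of (8.3) and the endpoint conditions of (8.4) are easy to experiment with numerically. The following is a minimal sketch, not part of the original argument: the function names are ours, matrices are passed as plain tuples (a, b, c, d) with ad − bc = 1, and floating point arithmetic is used throughout. Since θ is taken in [0, π), the matrix is recovered only up to sign, as is appropriate in PSL_2.

```python
import math

def iwasawa(a, b, c, d):
    """Iwasawa coordinates (t, s, theta) of a real matrix with ad - bc = 1.

    Convention of (8.3): g = n(t) a(s) k_theta, a(s) = diag(e^{s/2}, e^{-s/2}).
    """
    norm2 = c * c + d * d
    t = (a * c + b * d) / norm2          # real part of g.i
    s = -math.log(norm2)                 # imaginary part of g.i is e^s = 1/(c^2 + d^2)
    theta = math.atan2(c, d) % math.pi   # bottom row of g is e^{-s/2} (sin, cos), up to sign
    return t, s, theta

def from_iwasawa(t, s, theta):
    """Entries (a, b, c, d) of n(t) a(s) k_theta."""
    es = math.exp(s / 2)
    co, si = math.cos(theta), math.sin(theta)
    # n(t) a(s) = [[es, t/es], [0, 1/es]], then multiply by k_theta on the right
    return (es * co + t / es * si, -es * si + t / es * co, si / es, co / es)

def endpoints(a, b, c, d):
    """(e_plus, e_minus) = (a/c, b/d); math.inf stands for the point at infinity."""
    e_plus = a / c if c != 0 else math.inf
    e_minus = b / d if d != 0 else math.inf
    return e_plus, e_minus

def in_C_plus(s, theta):
    """Does a(s) k_theta satisfy the two endpoint conditions defining the '+' part in (8.4)?"""
    e_plus, e_minus = endpoints(*from_iwasawa(0.0, s, theta))
    return 0 < e_plus < 1 and e_minus < -1
```

For instance, at s = 0 the angle θ = π/3 satisfies both endpoint conditions while θ = π/6 does not, in line with the remark that the allowed angles form a subinterval ending at π/2.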

It will be convenient for us to introduce a 'thickening' of the cross-section $\tilde C$, which will be
denoted by $\tilde B$. The following lemma is left to be verified by the reader.

Lemma 8.3. There exists a constant $\epsilon_0 > 0$ (which will be fixed throughout) such that
the following statements hold:

(1) The map $(g,t) \mapsto ga(t)$ from $\tilde C \times (0, \epsilon_0)$ to the set

$$\tilde B = \{ga(t) : g \in \tilde C,\ t \in (0, \epsilon_0)\} \tag{8.5}$$

is one to one and onto, and the set $\tilde B$ is open in G.

(2) Let $B = \pi(\tilde B)$. The restriction $\pi : \tilde B \to B$ is one to one and onto, and the set
$B \subset X$ is open.

The constant $\epsilon_0$ introduced in the above lemma is a lower bound for the return time
function, $r_C$, to the cross-section C. The importance of part (2) of the above lemma is
that it gives us a well defined way of lifting points in X near the cross-section to the group
G, in which it is more convenient to work. The combination of parts (1) and (2) gives
us natural coordinates on B: any point $x \in B$ can be written uniquely as $x_C a(t)$, where
$x_C \in C$ and $t \in (0, \epsilon_0)$.

In our discussion we will encounter certain measures on the cross-section C which are
invariant under the first return map, and we will need a procedure to construct from them
measures on the ambient space X which are invariant under the geodesic flow; that is,
under the action of the group A.

Let $\tilde\mu$ be a probability measure on C. We define the suspension of $\tilde\mu$ to be the measure $\sigma_{\tilde\mu}$
on X which is given by the following rule of integration: for $f \in C_c(X)$,

$$\int_X f \, d\sigma_{\tilde\mu} = \int_C \int_0^{r_C(x)} f(xa(t)) \, dt \, d\tilde\mu(x). \tag{8.6}$$

Lemma 8.4. If $\tilde\mu(\mathrm{Dom}_{T_C}) = 1$ and $\tilde\mu$ is $T_C$-invariant, then the suspension $\sigma_{\tilde\mu}$ is
A-invariant. Furthermore, $\sigma_{\tilde\mu}(X) = \int_C r_C \, d\tilde\mu$.

Proof. This follows from [EW11, Lemma 9.23], taking into account that $T_C$ is invertible
on $\mathrm{Dom}_{T_C}$. □
Definition 8.5. Given a function $f : C \to \mathbb{C}$, we denote by $\tilde f : X \to \mathbb{C}$ the following
function:

$$\tilde f(x) = \begin{cases} f(x_C) & \text{if } x \in B \text{ has coordinates } (x_C, t), \\ 0 & \text{if } x \notin B. \end{cases}$$

Note that with the above definition, given a measure $\tilde\mu$ on C and a function $f : C \to \mathbb{C}$,
equation (8.6) translates to the following useful formula, which will be used frequently
below:

$$\int_X \tilde f \, d\sigma_{\tilde\mu} = \epsilon_0 \int_C f \, d\tilde\mu. \tag{8.7}$$

8.2. The Gauss map. Let $I = (0,1)$ and $S : I \to I$ be the Gauss map; i.e. the map
defined by the formula $S(y) = \frac{1}{y} - \lfloor \frac{1}{y} \rfloor$. Note that, strictly speaking, S(y) is not in I for
points of the form $y = \frac{1}{n}$. The reader will easily verify that $S^n(y)$ is well defined for all
positive n if and only if y is irrational. This slight inconvenience will not bother us as
we will only apply the Gauss map to irrational points. Let $I_{irr} = I \setminus \mathbb{Q}$. Consider the
following subsets of $\mathbb{R}^2$:

$$D = \Big\{(y,z) : y \in I,\ 0 < z < \frac{1}{1+y}\Big\}, \qquad D_{irr} = \{(y,z) \in D : y \in I_{irr}\}. \tag{8.8}$$

Let $\bar S : D \to D$ be the map given by $\bar S(y,z) = (S(y), y(1 - yz))$ and note similarly that,
strictly speaking, in order to iterate $\bar S$ as many times as we wish we need to restrict to
points in $D_{irr}$. Recall (see for example [EW11, §3.4]) that the normalized restriction of the
Lebesgue measure on $\mathbb{R}^2$ to D, which we denote here by $\lambda$, is an $\bar S$-invariant probability
measure. This is the so-called invertible¹⁰ extension of the Gauss map: when one projects
on the first coordinate, one recovers the Gauss map and the Gauss-Kuzmin measure $\nu$
introduced in the introduction. That is, if $p : D \to I$ denotes the projection on the first
coordinate, then

$$p_*\lambda = \nu. \tag{8.9}$$

As we will see below, the dynamical system $\bar S : D \to D$ can basically be identified with
$T_C : C \to C$.
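
To make the two maps of this subsection concrete, here is a small sketch in exact rational arithmetic. It is only an illustration under our own naming (gauss, gauss_ext, in_D, orbit are not notation from the text); rational starting points are used so that every value is exact, and they also exhibit the caveat above: a rational y leaves I after finitely many steps.

```python
from fractions import Fraction
from math import floor

def gauss(y):
    """Gauss map S(y) = 1/y - floor(1/y) for y in (0, 1)."""
    inv = 1 / y
    return inv - floor(inv)

def gauss_ext(y, z):
    """The extension (y, z) -> (S(y), y(1 - yz)) described in the text."""
    return gauss(y), y * (1 - y * z)

def in_D(y, z):
    """Membership in D = {(y, z) : 0 < y < 1, 0 < z < 1/(1+y)}."""
    return 0 < y < 1 and 0 < z < 1 / (1 + y)

def orbit(y, z, steps):
    """Iterate the extension, stopping as soon as the orbit leaves D."""
    points = [(y, z)]
    for _ in range(steps):
        y, z = gauss_ext(y, z)
        if not in_D(y, z):
            break  # happens for rational y, whose expansion is finite
        points.append((y, z))
    return points
```

Starting from (5/13, 1/2), the orbit stays in D for three steps and then stops once the first coordinate reaches 1/2, whose image under S is 0.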

8.3. Relation to the Gauss map. Consider the maps $\tau_+ : C^+ \to D$, $\tau_- : C^- \to D$
defined by the following formulas: for $x = \pi(g) \in C$, where $g \in \tilde C$ is of the form (8.1),

$$\text{for } g \in \tilde C^+, \quad \tau_+(x) = \Big(e_+(g), \frac{1}{e_+(g) - e_-(g)}\Big) = \Big(\frac{a}{c}, cd\Big); \tag{8.10}$$
$$\text{for } g \in \tilde C^-, \quad \tau_-(x) = \Big(-e_+(g), \frac{1}{-e_+(g) + e_-(g)}\Big) = \Big(-\frac{a}{c}, -cd\Big).$$

We let $\tau : C \to D$ be the union of $\tau_+$ and $\tau_-$. The formulas in (8.10) can be stated
geometrically as follows: for a tangent vector $g \in \tilde C$ and $x = \pi(g)$, $\tau(x) = (y,z) \in D$,
where y is the absolute value of the endpoint of the semicircle corresponding to g and
$z^{-1}$ is its diameter. For any endpoint $y \in (0,1)$ (resp. $y \in (-1,0)$) and any diameter
$z^{-1} > 1 + |y|$, we can attach a well defined semicircle in $\mathbb{H}$ which corresponds to a unique point
in $\tilde C^+$ (resp. $\tilde C^-$). This shows that $\tau$ is one to one and onto (and in fact, a homeomorphism)
from $C^+$ (resp. $C^-$) to D, which is the area below the graph of the function $y \mapsto (1+y)^{-1}$.
The following basic lemma is proved in [EW11, §9.6]. It establishes the link between the
geodesic flow and the Gauss map.

10 The term 'invertible' refers to the fact that when restricted to a subset of D, $\bar S$ is indeed invertible.
This subset is obtained by neglecting a certain set of Lebesgue measure zero (see [EW11, Prop. 3.15]).

Lemma 8.6. The following diagram commutes (for points $x \in C$ for which $T_C(x)$ is
defined):

$$\begin{array}{ccc} C & \xrightarrow{\ T_C\ } & C \\ {\scriptstyle \tau}\downarrow & & \downarrow{\scriptstyle \tau} \\ D & \xrightarrow{\ \bar S\ } & D \end{array}$$

Note that $\tau : C \to D$ is 'almost' an isomorphism (it is two to one), and so the
above lemma basically says that any dynamical question about the dynamical system
$\bar S : D \to D$ can be pulled back to a dynamical question about $T_C : C \to C$. In our case the
dynamical question is that of equidistribution of certain $\bar S$-invariant measures. Using the
suspension construction we will see that the equidistribution questions for the dynamical
system $T_C : C \to C$ translate to equidistribution questions for certain A-invariant measures
on X.



8.4. Reducing the statement of Theorem 2.1 to the cross-section. We will be
interested in two types of measures on the cross-section C defined above. The first is the
following version of the Lebesgue measure: we use $\tau_+$ (resp. $\tau_-$) to pull the (normalized
restriction of) Lebesgue measure $\lambda$ from D to $C^+$ (resp. $C^-$) and denote the resulting
measure by $\tilde\lambda^+$ (resp. $\tilde\lambda^-$). Further denote $\tilde\lambda = \frac12\tilde\lambda^+ + \frac12\tilde\lambda^-$. Clearly $\tilde\lambda$ is $T_C$-invariant and
$\tau_*(\tilde\lambda) = \lambda$.

The second type of measures on C are those coming from quadratic irrationals. We
recall some notation introduced earlier. Let $\alpha$ be a quadratic irrational. Let $g_\alpha$ be as
in (2.5). We chose to define $g_\alpha$ as we did so as to ensure that its determinant is positive
and hence it corresponds naturally to an element of G with endpoint $\alpha$. Let $x_\alpha \in X$ be
the corresponding point (that is, $x_\alpha = \pi\big(\tfrac{1}{\sqrt{\det g_\alpha}}\, g_\alpha\big)$) and $\mu_\alpha$ the A-invariant probability
measure supported on the periodic orbit $x_\alpha A = \{x_\alpha a(t) : t \in [0, t_\alpha)\}$, where $t_\alpha$ is the
length of the orbit (see Lemma 4.4). We claim that the intersection $C \cap x_\alpha A$ is a non-
empty finite set contained in $\mathrm{Dom}_{T_C}$. In fact, any geodesic in the upper half plane that
corresponds to a semicircle projects to a set in X that intersects C non-trivially. By
Lemma 8.2(3), if the endpoints of the geodesic are irrational, the intersection is in $\mathrm{Dom}_{T_C}$.
Finally, the finiteness follows from the fact that C is a cross-section together with the fact
that the orbit $x_\alpha A$ is of finite length.

Let us denote by $\tilde\mu_\alpha$ the normalized counting measure on $C \cap x_\alpha A$. Clearly $\tilde\mu_\alpha$ is
invariant under the first return map $T_C$. Recall that we denote by $\nu_\alpha$ the normalized
counting measure supported on the period $P_\alpha \subset I_{irr}$ of the orbit of $\alpha$ mod 1 under the
Gauss map (see the last paragraph of §1.2).

Lemma 8.7. Let $p : D \to I$ be the projection on the first coordinate. Then

$$(p \circ \tau)_*(\tilde\lambda) = \nu, \qquad (p \circ \tau)_*(\tilde\mu_\alpha) = \nu_\alpha. \tag{8.11}$$

Proof. The first equality in (8.11) follows from the fact that $\tau_*(\tilde\lambda) = \lambda$ (which is basically
the definition of $\tilde\lambda$) and the observation $p_*(\lambda) = \nu$ which was pointed out in (8.9). We
argue the second equality: by Lemma 8.6 the measure $\tau_*(\tilde\mu_\alpha)$ is $\bar S$-invariant. By the
above discussion it is finitely supported. Since $x_\alpha A$ is a loop, the first return map $T_C$
acts transitively on the support of $\tilde\mu_\alpha$ and so the support of $\tau_*(\tilde\mu_\alpha)$ consists of a single $\bar S$-
orbit. This clearly implies that $(p \circ \tau)_*(\tilde\mu_\alpha)$ is supported on a single periodic orbit of the
Gauss map S. Denote this period by $P'_\alpha$. We only need to argue why $P_\alpha = P'_\alpha$, which is
equivalent to $P_\alpha \cap P'_\alpha \ne \emptyset$.

Consider the matrix $g_\alpha$ defined in (2.5). The tangent vector corresponding to $g_\alpha$ defines
a geodesic in $T^1\mathbb{H}$ whose projection to $\mathbb{H}$ is a semicircle with endpoint $e_+(g_\alpha) = \alpha$. At some point along
this geodesic we find a point g which projects to $C^+$ under $\pi$. Let $x = \pi(g) \in C^+$ and
$g' \in \tilde C^+$ the corresponding point in $\tilde C^+$. Clearly x is in the support of $\tilde\mu_\alpha$ and hence the
endpoint $e_+(g') = p \circ \tau(x)$ is a point of $P'_\alpha$. As the semicircle corresponding to $g_\alpha$ and the
one corresponding to $g'$ are related by the action of some $\gamma \in \Gamma$ as a Möbius transformation,
it follows that the endpoints $\alpha, e_+(g')$ are related by the action of $\gamma$ as well. This action
can affect only finitely many digits of the c.f.e of $\alpha$ and we conclude that the periods of
the c.f.e of $\alpha$ and of $e_+(g')$ must be the same (up to a possible cyclic rotation), which
finishes the proof. □
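
The last step of the proof, that a Möbius transformation from Γ changes only finitely many digits and hence preserves the period up to cyclic rotation, can be checked on examples. The sketch below is an illustration only: it uses the classical integer recursion for the c.f.e of a quadratic irrational written as (P + √D)/Q, and the names cfe_period and up_to_rotation are ours. It assumes D > 0 is not a perfect square and that Q divides D − P².

```python
from math import isqrt

def cfe_period(P, Q, D):
    """Periodic part of the c.f.e of (P + sqrt(D)) / Q.

    Assumes D > 0 is not a perfect square and Q divides D - P*P.
    """
    r = isqrt(D)
    seen, digits = {}, []
    while (P, Q) not in seen:
        seen[(P, Q)] = len(digits)
        # floor((P + sqrt(D)) / Q) using integers only; for Q < 0 we use
        # floor(-sqrt(D)) = -r - 1, valid because D is not a perfect square
        a = (P + r) // Q if Q > 0 else (-P - r - 1) // (-Q)
        digits.append(a)
        # the next complete quotient 1 / (x - a) is again of the form (P + sqrt(D)) / Q
        P = a * Q - P
        Q = (D - P * P) // Q
    return digits[seen[(P, Q)]:]

def up_to_rotation(period):
    """Canonical representative of a period modulo cyclic rotation."""
    return min(period[i:] + period[:i] for i in range(len(period)))
```

For example, √7 corresponds to (P, Q, D) = (0, 1, 7) and −1/√7, its image under z ↦ −1/z, to (0, −7, 7); both calls return the period [1, 1, 1, 4].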

In light of (8.11), the following theorem clearly implies Theorem 2.1. It is merely the
corresponding statement translated to the cross-section.

Theorem 8.8. Let $\alpha$, S, q, and $\epsilon$ be as in the statement of Theorem 2.1. For any $\kappa$-Lipschitz
function $f : D \to \mathbb{C}$,

$$\Big| \int_C f \circ \tau \, d\tilde\mu_{q\alpha} - \int_C f \circ \tau \, d\tilde\lambda \Big| \ll_{\alpha,S,\epsilon} \max\{\|f\|_\infty, \kappa\} \, \mathrm{ht}(q)^{-\frac{\delta_0}{6}+\epsilon}. \tag{8.12}$$



8.5. Relation to the equidistribution of the loops. In light of Theorem 2.8, the
content of the following lemma is clearly relevant for the proof of Theorem 8.8 above.

Lemma 8.9. Let $\alpha$ be a quadratic irrational. The suspensions $\sigma_{\tilde\lambda}, \sigma_{\tilde\mu_\alpha}$ of the probability
measures $\tilde\lambda, \tilde\mu_\alpha$ are proportional to $m_X, \mu_\alpha$ respectively.

Proof. The fact that $\sigma_{\tilde\lambda}$ is proportional to the Haar measure $m_X$ is proved in [EW11,
p. 325-326]. The outline of the proof is as follows: by Lemma 8.4, $\sigma_{\tilde\lambda}$ is A-invariant.
One shows that it is absolutely continuous with respect to $m_X$ and deduces the result
from the ergodicity of $m_X$ with respect to the A-action. Regarding $\sigma_{\tilde\mu_\alpha}$, note that it is
clearly a measure supported on the orbit $x_\alpha A$, and it is A-invariant by Lemma 8.4.
The assertion now follows from the uniqueness (up to proportionality) of an A-invariant
measure on the periodic orbit $x_\alpha A$. □



8.6. Concluding the proof of Theorem 8.8. The argument yielding Theorem 8.8 is
slightly technical because of the following issue: we start with a $\kappa$-Lipschitz function
$f : D \to \mathbb{C}$ and construct from it the function $\widetilde{f \circ \tau} : X \to \mathbb{C}$ as in Definition 8.5. As
we wish to appeal to Theorem 2.8, we need to modify $\widetilde{f \circ \tau}$ to be Lipschitz in a way that
allows us to control its Lipschitz constant. In order to achieve this we shall need the
following technical lemma, which is proved in §9.

Lemma 8.10. For any $M > 1$ and $0 < \rho < 1$ there exists a function $\varphi = \varphi_{\rho,M} : X \to [0,1]$
with the following properties:

(1) The function $\varphi$ is $\rho^{-1}$-Lipschitz.

(2) We have $\int_X (1 - \varphi) \, dm_X \ll M^{-1} + \rho \log M$.

(3) Given a $\kappa$-Lipschitz function $f : D \to \mathbb{C}$, the product $\widetilde{f \circ \tau} \cdot \varphi : X \to \mathbb{C}$ is Lipschitz
with Lipschitz constant $\ll \max\{\|f\|_\infty, \kappa\} \rho^{-1} M$.



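Proof of Theorem 8.8. Let $\alpha, q, S, \epsilon$ be as in the statement of the theorem and $f : D \to \mathbb{C}$
a $\kappa$-Lipschitz function. Let $c_0, c_q$ be the proportionality constants satisfying $c_0 m_X = \sigma_{\tilde\lambda}$,
$c_q \mu_{q\alpha} = \sigma_{\tilde\mu_{q\alpha}}$, whose existence is given by Lemma 8.9. Using (8.7) we have the following
estimate:

$$\begin{aligned} \Big| \int_C f \circ \tau \, d\tilde\mu_{q\alpha} - \int_C f \circ \tau \, d\tilde\lambda \Big| &= \Big| \frac{c_q}{\epsilon_0} \int_X \widetilde{f \circ \tau} \, d\mu_{q\alpha} - \frac{c_0}{\epsilon_0} \int_X \widetilde{f \circ \tau} \, dm_X \Big| \\ &\le \epsilon_0^{-1} \underbrace{|c_q - c_0|}_{(*)} \Big| \int_X \widetilde{f \circ \tau} \, d\mu_{q\alpha} \Big| + c_0 \epsilon_0^{-1} \underbrace{\Big| \int_X \widetilde{f \circ \tau} \, d\mu_{q\alpha} - \int_X \widetilde{f \circ \tau} \, dm_X \Big|}_{(**)}. \end{aligned} \tag{8.13}$$

We first estimate the expression (**) in (8.13). Given $M > 1$, $0 < \rho < 1$ we let $\varphi = \varphi_{\rho,M}$
be as in Lemma 8.10 and denote $\psi = 1 - \varphi$.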



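$$\begin{aligned} \Big| \int_X \widetilde{f \circ \tau} \cdot (\varphi + \psi) \, d\mu_{q\alpha} - \int_X \widetilde{f \circ \tau} \cdot (\varphi + \psi) \, dm_X \Big| \le{} & \Big| \int_X \widetilde{f \circ \tau} \cdot \varphi \, d\mu_{q\alpha} - \int_X \widetilde{f \circ \tau} \cdot \varphi \, dm_X \Big| \\ & + \Big| \int_X \widetilde{f \circ \tau} \cdot \psi \, d\mu_{q\alpha} \Big| + \Big| \int_X \widetilde{f \circ \tau} \cdot \psi \, dm_X \Big|. \end{aligned} \tag{8.14}$$

We will estimate each of the three summands on the right hand side of the inequality (8.14).
By Lemma 8.10(2) we have

$$\Big| \int_X \widetilde{f \circ \tau} \cdot \psi \, dm_X \Big| \le \|f\|_\infty \int_X \psi \, dm_X \ll \|f\|_\infty \big( M^{-1} + \rho \log M \big). \tag{8.15}$$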
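Next, note that by Lemma 8.10(1), $\psi$ is $\rho^{-1}$-Lipschitz and so we may apply Theorem 2.8,
which together with the estimate (8.15) yields

$$\begin{aligned} \Big| \int_X \widetilde{f \circ \tau} \cdot \psi \, d\mu_{q\alpha} \Big| &\le \|f\|_\infty \int_X \psi \, d\mu_{q\alpha} \\ &\ll_{\alpha,S,\epsilon} \|f\|_\infty \Big( \int_X \psi \, dm_X + \max\{1, \rho^{-1}\} \, \mathrm{ht}(q)^{-\frac{\delta_0}{2}+\epsilon} \Big) \\ &\ll \|f\|_\infty \Big( M^{-1} + \rho \log M + \rho^{-1} \mathrm{ht}(q)^{-\frac{\delta_0}{2}+\epsilon} \Big). \end{aligned} \tag{8.16}$$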



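Finally, by Lemma 8.10(3) we can apply Theorem 2.8 to the function $\widetilde{f \circ \tau} \cdot \varphi$ and conclude
the following:

$$\Big| \int_X \widetilde{f \circ \tau} \cdot \varphi \, d\mu_{q\alpha} - \int_X \widetilde{f \circ \tau} \cdot \varphi \, dm_X \Big| \ll_{\alpha,S,\epsilon} \max\{\|f\|_\infty, \kappa\} \, \rho^{-1} M \, \mathrm{ht}(q)^{-\frac{\delta_0}{2}+\epsilon}. \tag{8.17}$$

We now make the choice $M = \rho^{-1} = \mathrm{ht}(q)^{\frac13(\frac{\delta_0}{2} - \epsilon)}$ and combine estimates (8.15), (8.16), (8.17)
into (8.14) to obtain

$$(**) \ll_{\alpha,S,\epsilon} \max\{\|f\|_\infty, \kappa\} \, \mathrm{ht}(q)^{-\frac{\delta_0}{6}+\epsilon}, \tag{8.18}$$

where in the above inequality $\frac{\epsilon}{3}$ was replaced by $\epsilon$ and the term $\log(\mathrm{ht}(q))$ was absorbed
in the term $\mathrm{ht}(q)^{\epsilon}$.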
In order to finish we need to further estimate (*) in (8.13). To obtain this estimate
from the above we take $f : D \to \mathbb{C}$ to be identically 1 and note that in this case $\widetilde{f \circ \tau} = \chi_B$,
and so using (8.7) we have

$$\Big| \int_X \widetilde{f \circ \tau} \, d\mu_{q\alpha} - \int_X \widetilde{f \circ \tau} \, dm_X \Big| = |\mu_{q\alpha}(B) - m_X(B)| = \Big| \frac{\epsilon_0}{c_q} - \frac{\epsilon_0}{c_0} \Big|. \tag{8.19}$$

The left hand side of (8.19) is (**) for this choice of f and so by (8.18) we obtain

$$|c_q^{-1} - c_0^{-1}| \ll_{\alpha,S,\epsilon} \mathrm{ht}(q)^{-\frac{\delta_0}{6}+\epsilon}. \tag{8.20}$$

This establishes that $c_q \to c_0$ as $\mathrm{ht}(q) \to \infty$ and therefore in particular, for all but finitely
many q's, $c_q^{-1} > \frac12 c_0^{-1}$ (equivalently, $c_q < 2c_0$). For such q's we conclude from (8.20) that

$$(*) = |c_q - c_0| = c_q c_0 \, |c_q^{-1} - c_0^{-1}| \ll_{\alpha,S,\epsilon} \mathrm{ht}(q)^{-\frac{\delta_0}{6}+\epsilon}. \tag{8.21}$$

As there are only finitely many problematic q's we may simply choose the implicit constant
to be large enough so that (8.21) holds for any rational q supported on S. Note that
this change in the implicit constant depends on S and $\alpha$.

Plugging this estimate of (*) together with (8.18) into (8.13) we obtain the desired
inequality (8.12) appearing in the statement of the theorem. □

Proof of Corollary 2.5. We use the notation introduced in the proof of Theorem 2.1 presented
above. For a quadratic irrational $\alpha$ let $\tilde P_\alpha$ denote the support of $\tilde\mu_\alpha$. It follows
from (8.11) that $p \circ \tau(\tilde P_\alpha) = P_\alpha$. It is straightforward to argue that the map $p \circ \tau : \tilde P_\alpha \to P_\alpha$
is always 2 to 1. The proportionality constant $c_q$ defined by $c_q \mu_{q\alpha} = \sigma_{\tilde\mu_{q\alpha}}$ can be rewritten
as follows: the left hand side of the equation $\mu_{q\alpha}(B) = \frac{\epsilon_0}{c_q}$ (see (8.19)) can be expressed
in a different way; the geodesic $x_{q\alpha}A$, which is of length $t_{q\alpha}$, penetrates B exactly $|\tilde P_{q\alpha}|$
times and stays in B along a time interval of length $\epsilon_0$ each time, and so $\mu_{q\alpha}(B) = \frac{|\tilde P_{q\alpha}| \cdot \epsilon_0}{t_{q\alpha}}$.

It follows that c" 1 = ipsi. By Lemma O we see that t qa = c,c(ht(g)), where the gen- 
eralized branch C is one of the finitely many rational generalized branches £gl, e/ (see 
the notation introduced in Remark I4.3I1 5])). We conclude from Lemma [5. II that the ratio 
c a(q) — 7^ attains only finitely many positive values and that if q n is a sequence such 
that ht(g n | ht(g n+ i), then c a (q n ) stabilizes. It now follows from (I8.20p that 



ca{q) W)~ c ° 



as desired. □ 

Sketch of proof of Theorem 2.7. In the proofs of Theorem 2.1 and Corollary 2.5 we obtained
results about the c.f.e of numbers of the form qa. The information was extracted
from an understanding of the measure fi qa which is supported on the periodic A-orbit 
through the point on the rational generalized branch £g', e/ which lies on the sphere 
Sht(q)(%a) in the S'-Hecke graph Qs{x a ). The information about the c.f.e of q{j ■ a) (here 
7 G T and 7 • a denotes the action of 7 on a as a Mobius transformation), is obtained by 
studying the periodic orbits through points on the rational generalized branches CJ q _ x . 

The instances in the proofs of Theorem 2.1 and Corollary 2.5 in which implicit constants
depending on $\alpha$ appeared were when we appealed to Theorem 2.8 and Lemma 5.1. If
instead of appealing to Theorem 2.8 we appeal to Theorem 4.9, we see that in the case $x_\alpha$
is non-split (that is, no prime in S splits over $\mathbb{Q}(\alpha)$), the implicit constants may be taken
independent of the generalized branch. This implies the validity of the theorem. □

9. Construction of $\varphi$ - Proof of Lemma 8.10

9.1. Motivation. We start with a function $f : D \to \mathbb{C}$ which is $\kappa$-Lipschitz and we
consider the function $\tilde f : X \to \mathbb{C}$ given by $\tilde f = \widetilde{f \circ \tau}$. The points of discontinuity of $\tilde f$ are
contained in $\partial B$. We wish to find an approximation of $\tilde f$ which is not only continuous
but for which we have clear control on its Lipschitz constant. To achieve this, we
construct an auxiliary function $\varphi$ which vanishes in an $\epsilon$-thickening of $\partial B$ and is equal
to 1 outside a $2\epsilon$-thickening of $\partial B$. This will clearly make $\tilde f \cdot \varphi$ continuous, but in order
to control its Lipschitz constant we will have to make $\varphi$ vanish 'high in the cusp', where
the differential of $\tau$ explodes (see Lemma 9.6 below). Along the construction we need to
pay attention to two more quantities which we should control: the Lipschitz constant of
$\varphi$ and $\int \psi$, where $\psi = 1 - \varphi$. These clearly fight one against the other; in order to make
$\int \psi$ small we wish to take $\epsilon$ (which controls the above thickening) to be small, which makes
the Lipschitz constant of $\varphi$ large.

Below, in §9.2-9.5, we discuss a somewhat eclectic collection of observations that we
will use in order to carry out the arguments in §9.6 with little interruption.
9.2. General metric observations. Let (Y, d) be a metric space. For a subset $F \subset Y$
we denote

$$(F)_\epsilon = \{y \in Y : d(y, F) < \epsilon\};$$

that is, the set of all points of distance $< \epsilon$ from F. The following general construction
allows us to build Lipschitz functions in abundance. The proof is left to the reader.

Lemma 9.1 (Fundamental construction). Let (Y, d) be a metric space and $F \subset Y$ a
subset. For $\epsilon > 0$ define $\varphi_{\epsilon,F} : Y \to [0,1]$ by $\varphi_{\epsilon,F}(y) = \min\{1, \epsilon^{-1} d(y, F)\}$. Then $\varphi_{\epsilon,F}$
attains the constant values 0 on F and 1 on $Y \setminus (F)_\epsilon$. Furthermore, $\varphi_{\epsilon,F}$ is $\epsilon^{-1}$-Lipschitz.
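
The fundamental construction is simple enough to try out directly. The sketch below is an illustration only, restricted to a finite set F so that the distance d(y, F) is a minimum over finitely many points; the name fundamental and its signature are ours.

```python
def fundamental(eps, F, dist):
    """Return phi_{eps,F}(y) = min{1, d(y, F) / eps} for a finite set F.

    dist is the metric d of the ambient space; eps must be positive.
    """
    def phi(y):
        d_to_F = min(dist(y, p) for p in F)   # d(y, F) for finite F
        return min(1.0, d_to_F / eps)
    return phi
```

On the real line with F = {0, 2} and ε = 1/2, the resulting function vanishes on F, equals 1 at distance at least 1/2 from F, and has slope at most 2 = ε⁻¹ in between.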

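We now make two remarks regarding Lipschitz constants:

Remark 9.2. Consider two functions, $f : Y \to \mathbb{C}$ and $\varphi : Y \to [0,1]$, on a metric space
(Y, d) and assume that they are $\kappa_f$-, $\kappa_\varphi$-Lipschitz respectively with $\kappa_\varphi \ge 1$. Then, for any
$x, y \in Y$ we have

$$\begin{aligned} |f \cdot \varphi(x) - f \cdot \varphi(y)| &\le |f(x) - f(y)| \, \varphi(x) + |f(y)| \, |\varphi(x) - \varphi(y)| \\ &\le 2 \max\{\kappa_f, \|f\|_\infty\} \, \kappa_\varphi \, d(x,y); \end{aligned}$$

that is, $f \cdot \varphi$ has Lipschitz constant at most $2\max\{\kappa_f, \|f\|_\infty\}\kappa_\varphi$.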

Remark 9.3. Let $f : Y \to \mathbb{C}$ be a continuous function on a metric space (Y, d) in which
between any two points x, y there exists a path whose length equals d(x, y). Suppose
there is an open cover $\{U_i\}$ of supp(f) such that for each i the restriction $f : U_i \to \mathbb{C}$ is
$\kappa$-Lipschitz. Then we claim that f is $\kappa$-Lipschitz as a function on Y. To see this, take two
points $x, y \in Y$ and connect them by a path $\gamma$ whose length is d(x, y). As f is assumed to
be continuous, we can turn the open cover $\{U_i\}$ of the support of f into an open cover of Y
by adjoining the open set $U_0 = Y \setminus \mathrm{supp}(f)$. Clearly f is $\kappa$-Lipschitz on $U_0$ as well. Now
let $\epsilon > 0$ be a Lebesgue number for the induced open cover of the path $\gamma$. Choose points
$x = x_0, x_1, \dots, x_n = y$ on $\gamma$ in a monotone way (so that $d(x,y) = \sum_i d(x_i, x_{i-1})$) and such
that the distance between $x_i$ and $x_{i-1}$ is less than $\epsilon$. It follows that for each $1 \le i \le n$
there exists an open set $U_{j_i}$ from the cover such that $x_i, x_{i-1} \in U_{j_i}$. As f is assumed to
be $\kappa$-Lipschitz on $U_{j_i}$, we conclude that

$$|f(x) - f(y)| \le \sum_{i=1}^n |f(x_i) - f(x_{i-1})| \le \sum_{i=1}^n \kappa \, d(x_i, x_{i-1}) = \kappa \, d(x,y).$$

9.3. Coordinates. We wish to define a convenient coordinate system which will allow
us to carry out the relevant computations. Recall the open subsets $B, \tilde B$ of X, G
respectively that were defined in Lemma 8.3. We define, similarly to (8.4),

$$\tilde B^+ = \{ga(t) : g \in \tilde C^+,\ t \in (0, \epsilon_0)\}, \tag{9.1}$$
$$\tilde B^- = \{ga(t) : g \in \tilde C^-,\ t \in (0, \epsilon_0)\}.$$

A point $g \in \tilde B$ can be written uniquely in the form $a(s)k_\theta a(t)$ where $s \in \mathbb{R}$, $t \in (0, \epsilon_0)$ and
the angle $\theta \in [0, \pi)$ has some restrictions on it, arising from the requirements about the
endpoints of the semicircle corresponding to g. We shall refer to $(s, \theta, t)$ as the coordinates
of the point $g \in \tilde B$ or of the corresponding point $\pi(g) \in B$.

As the action of a(t) from the right does not affect the endpoints, the restrictions on the
$\theta$-coordinate are a function of s alone. We work out these restrictions for, say, $g \in \tilde B^+$: we
already observed (after (8.4)) that $\theta \in (\frac{\pi}{4}, \frac{\pi}{2})$ (in order to ensure that $e_+(g) \in (0,1)$). It
is easy to see from the definition of the start and end points that for $s \in \mathbb{R}$, $a(s)k_\theta \in \tilde C^+$,
where $\theta \in [0, \pi)$, if and only if $e^s \cot\theta \in (0,1)$ and $-e^s \tan\theta < -1$. This is equivalent to
saying $\tan\theta \in (\max\{e^s, e^{-s}\}, \infty)$, since the first condition reads $\tan\theta > e^s$ and the second
$\tan\theta > e^{-s}$. We choose an inverse $\tan^{-1} : \mathbb{R}_{>0} \to (0, \frac{\pi}{2})$ and conclude
that for a given s, the range of allowed angles for points $g \in \tilde B^+$ with coordinates $(s, \theta, t)$
is the interval

$$I_s^+ = \big(\theta_{\min}(s), \tfrac{\pi}{2}\big), \quad \text{where } \theta_{\min}(s) = \tan^{-1}(\max\{e^s, e^{-s}\}) \ge \tfrac{\pi}{4}. \tag{9.2}$$

Let us denote

$$\mathcal{E}^+ = \{(s, \theta, t) \in \mathbb{R}^3 : s \in \mathbb{R},\ t \in (0, \epsilon_0),\ \theta \in I_s^+\}, \tag{9.3}$$

and define similarly $\mathcal{E}^-$ and $\mathcal{E} = \mathcal{E}^+ \cup \mathcal{E}^-$. Let $\xi : \mathbb{R}^3 \to G$ be the function

$$\xi(s, \theta, t) = a(s)k_\theta a(t). \tag{9.4}$$

Clearly, we have $\xi(\mathcal{E}) = \tilde B$, $\xi(\mathcal{E}^+) = \tilde B^+$, and $\xi(\mathcal{E}^-) = \tilde B^-$.

Lemma 9.4. There is an absolute constant c such that for any $\epsilon > 0$, an $\epsilon$-ball in $\mathcal{E}$ is
mapped by $\xi$ into a ball of radius $c\epsilon$ in $\tilde B$.

Proof. We link any two points $g_i = \xi(s_i, \theta_i, t_i) \in \tilde B$, $i = 1, 2$, by the path which changes
linearly the s-coordinate first, then the $\theta$-coordinate, and finally the t-coordinate. Each such
change corresponds to the action from the right by a one-parameter subgroup h(t) as in
Lemma 3.3. The change in the s-coordinate corresponds to $h(s) = a(-t_1)k_{-\theta_1}a(s)k_{\theta_1}a(t_1)$,
the change in the $\theta$-coordinate corresponds to $h(\theta) = a(-t_1)k_\theta a(t_1)$, and finally, the change
in the t-coordinate corresponds to $h(t) = a(t)$. As the one-parameter subgroups
involved in this process are conjugates of a(t) and $k_\theta$, where the conjugating
element varies in a compact set, we conclude that the norm of the derivative at the
identity, $h'(0)$, is $\ll 1$ for some absolute implicit constant. Lemma 3.3 then implies that

$$d_G(g_1, g_2) \ll |s_1 - s_2| + |\theta_1 - \theta_2| + |t_1 - t_2|,$$

which establishes the claim. □

9.4. Height. The map $\tau$ defined in (8.10) was considered so far as a map from the cross-
section C. As we wish to use differentiation it will be more convenient to extend it to a
map $\tau : B \to D$ in the following way: given a point $x \in B$, it can be written uniquely
as $x_C a(t)$ where $x_C \in C$ and $t \in (0, \epsilon_0)$. We define $\tau(x) = \tau(x_C)$; that is, we view $\tau$ as a
function on B which is constant along the direction of the geodesic flow.

As will be seen shortly, the norm of the differential of $\tau : B \to D$ is not bounded,
and so, in order to be able to control the Lipschitz constant of the function appearing
in Lemma 8.10(3), we need to force its support to be contained in a domain in which we
have some control on $\|d\tau\|$.

Recall the Iwasawa decomposition (8.3). Let $\mathcal{F}$ denote the usual fundamental domain
of $\Gamma$ in G, that is,

$$\mathcal{F} = \Big\{ n(t)a(s)k_\theta \in G : |t| \le \tfrac12,\ t^2 + e^{2s} \ge 1 \Big\}. \tag{9.5}$$

We define the height function $\mathrm{ht} : G \to \mathbb{R}$ to be $\mathrm{ht}(g) = e^s$ if $g = n(t)a(s)k_\theta$. This is indeed
the imaginary coordinate of the base-point of the tangent vector to $\mathbb{H}$ corresponding to
g. This function respects the identifications induced by $\Gamma$ on the boundary of $\mathcal{F}$ and so
descends to a function (which we continue to denote ht(·)) on X. For any $M > 1$ we let

$$\tilde H_M = \{g \in \mathcal{F} : \mathrm{ht}(g) > M\}, \quad \tilde K_M = \{g \in \mathcal{F} : \mathrm{ht}(g) \le M\}; \tag{9.6}$$
$$H_M = \{x \in X : \mathrm{ht}(x) > M\}, \quad K_M = \{x \in X : \mathrm{ht}(x) \le M\}.$$

Remark 9.5. It is well known that $m_X(H_M) = m_G(\tilde H_M) = M^{-1}$, which is an identity
that will be needed later (need to add reference).
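
The identity in Remark 9.5 can be made plausible by a direct computation. The following sketch rests on an assumption of ours about normalization: we take the Haar measure to be hyperbolic area $\frac{dx\,dy}{y^2}$ on the base point $x + iy$ times the normalized angular measure on the fibre. For $M > 1$ the part of the fundamental domain of height greater than M is the strip $|x| \le \frac12$, $y > M$, so that:

```latex
% Assumption: Haar measure = hyperbolic area (dx dy / y^2) times normalized angular measure.
\[
  \int_{-1/2}^{1/2} \int_{M}^{\infty} \frac{dy\,dx}{y^{2}}
  \;=\; \int_{M}^{\infty} \frac{dy}{y^{2}}
  \;=\; \frac{1}{M}
  \qquad (M > 1).
\]
```

With this normalization the total mass of X is the hyperbolic area $\frac{\pi}{3}$ of the modular surface, so if one instead normalizes $m_X$ to be a probability measure the value becomes $\frac{3}{\pi M}$; in either case the measure of the cusp region is of order $M^{-1}$.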

9.5. Estimating norms of differentials.

Lemma 9.6. The differentials of $\tau : B \to D$ and $\mathrm{ht} : X \to \mathbb{R}$ at a point y satisfy
$\|d_y\tau\| \ll \mathrm{ht}(y)$, $\|d_y(\mathrm{ht})\| \ll \mathrm{ht}(y)$.

Proof. We calculate for example $\|d_y\tau\|$ for $y \in B^+$ (here $B^+ = \pi(\tilde B^+)$). Let N, H, and W
denote the respective derivatives at time t = 0 of the one parameter subgroups n(s), a(t),
and $k_\theta$ which appear in (8.3):

$$N = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \quad H = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \quad W = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}.$$

Let $g \in \tilde B^+$ be such that $y = \pi(g)$ and write g as in (8.1), so that $\tau(y) = (\frac{a}{c}, cd)$ as
given in (8.10). The tangent space $T_y(X)$ is identified (as an inner product space) with
$T_g(G)$, which is in turn identified with the Lie algebra $\mathfrak{g} = \mathfrak{sl}_2(\mathbb{R})$ via the map sending
a matrix $V \in \mathfrak{g}$ to gV; here we make a choice of an inner product on $\mathfrak{g}$ which induces
the left-invariant Riemannian metric on G and hence on the quotient X. Thus, we will
obtain an upper bound for the norm of $d_y\tau$ if we calculate an upper bound for the norms
in $\mathbb{R}^2$ of the vectors $d_y\tau(gV)$ for V = N, H, W (where here we abuse notation and think
of $d_y\tau$ as a map from $T_g(G)$ to $\mathbb{R}^2$).

We may think of the above 2×2 matrices as vectors in $\mathbb{R}^4$ (where the first row corresponds
to the first two coordinates) and then we get that $d_y\tau$ is given by the matrix

$$d_y\tau = \begin{pmatrix} \frac{1}{c} & 0 & -\frac{a}{c^2} & 0 \\ 0 & 0 & d & c \end{pmatrix}.$$

A short calculation shows that

$$d_y\tau(gN) = \begin{pmatrix} 0 \\ c^2 \end{pmatrix}, \quad d_y\tau(gH) = \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \quad d_y\tau(gW) = \begin{pmatrix} c^{-2} \\ c^2 - d^2 \end{pmatrix}.$$

We conclude that $\|d_y\tau\| \ll \max\{c^2, c^{-2}, d^2\}$, where the implicit constant comes from the
fact that we did not specify an inner product on $\mathfrak{g}$. Writing g in its $(s, \theta, t)$-coordinates
$g = a(s)k_\theta a(t)$ we calculate c, d and conclude that as $|t| < \epsilon_0$, $\|d_y\tau\| \ll e^{|s|}$. Remark 9.8
now gives $\|d_y\tau\| \ll \mathrm{ht}(y)$ as desired.

We briefly describe the estimate for $d_y(\mathrm{ht})$. Let $g \in \mathcal{F}$ be such that $y = \pi(g)$. Assume
for a start that the Iwasawa decomposition of g is given by $g = n(t)a(s)$. Then the
derivatives in the directions of W and N are trivial (because the actions from the right
of the one parameter groups $k_\theta, n(t)$ do not change the height). The derivative in the
direction of H is $e^s$, which equals ht(y). It follows that for such points $\|d_y(\mathrm{ht})\| \ll
\mathrm{ht}(y)$. Now for the general case, let $g = n(t)a(s)k_\theta \in \mathcal{F}$ be the Iwasawa decomposition
and consider the composition $G \to G \to \mathbb{R}$ given by first acting on the right by $k_{-\theta}$
and then applying ht. As ht is invariant under the action from the right by $k_{-\theta}$, this
composition equals ht. Its differential at y equals, by the chain rule, the composition of
the differential of right multiplication by $k_{-\theta}$ at the point y and the differential of ht at
the point $y' = \pi(g')$, where $g' = n(t)a(s)$. As right multiplication by $k_{-\theta}$ is an isometry,
the first differential has norm 1 (here we use the fact that the left invariant Riemannian
metric we chose on G is also right $\{k_\theta\}$-invariant). We evaluated the norm of the second
differential before, and we conclude that the composition satisfies the desired estimate. □

Remark 9.7. As the differential of $\mathrm{ht} : X \to \mathbb{R}$ is $\ll M$ on $K_M$, it follows that it
is Lipschitz there with a Lipschitz constant $\ll M$ (see Remark 9.9). We conclude that
there exists some absolute constant $\ell$ (which is the implicit constant in the estimate
$\|d_y(\mathrm{ht})\| \ll \mathrm{ht}(y)$), such that the following two statements hold:

(1) For any $0 < \epsilon < 1$, $(H_M)_\epsilon \subset H_{M/\ell}$.

(2) For any $0 < \epsilon < 1$, $(K_M)_\epsilon \subset K_{\ell M}$.

To see (1) for example, note that if this were false, then we could find $x \in K_{M/\ell}$ whose distance
from $H_M$ is $< 1$. We conclude that there must be a point $x'$ such that $\mathrm{ht}(x') = M$
and $d_X(x, x') < 1$. This of course contradicts the fact that ht is Lipschitz on $K_M$ with constant $\ll M$.
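The passage from the differential bound to statements (1) and (2) can be made explicit by working with $\log \mathrm{ht}$. The following sketch assumes that ht is differentiable along the path used, that $\|d_y(\mathrm{ht})\| \le \ell\, \mathrm{ht}(y)$ holds with the constant $\ell$ of the remark, and that $d_X$ is the infimum of lengths of piecewise smooth paths. It yields the constant $e^{\ell}$ in place of $\ell$, which is equally absolute.

```latex
% gamma : [0,1] -> X is a piecewise smooth path from x to x' (hypothetical notation).
\[
\Big| \frac{d}{du} \log \mathrm{ht}(\gamma(u)) \Big|
  = \frac{|d_{\gamma(u)}(\mathrm{ht})(\gamma'(u))|}{\mathrm{ht}(\gamma(u))}
  \le \ell\, \|\gamma'(u)\| .
\]
% Integrate over [0,1], then take the infimum over all such paths:
\[
|\log \mathrm{ht}(x') - \log \mathrm{ht}(x)| \le \ell\, d_X(x, x'),
\qquad\text{so}\qquad
e^{-\ell\, d_X(x,x')} \le \frac{\mathrm{ht}(x')}{\mathrm{ht}(x)} \le e^{\ell\, d_X(x,x')} .
\]
```

In particular, if $d_X(x, x') < 1$ and $\mathrm{ht}(x') \ge M$ then $\mathrm{ht}(x) > e^{-\ell} M$, and if $\mathrm{ht}(x') \le M$ then $\mathrm{ht}(x) < e^{\ell} M$.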

Remark 9.8. We wish to comment on the height of a point $y = \pi(g) \in B$, where $g \in \mathcal{B}$
has coordinates $(s, \theta, t)$. By Lemma 3.3, if we let $g' \in \mathcal{C}$ be the point with coordinates
$(s, \theta, 0)$, then $d_G(g, g') \le \epsilon_0$ (here we take $h(t) = a(t)$ to 'cancel' the $t$-coordinate in at
most $\epsilon_0$ time). The height of $g'$ is by definition $\mathrm{ht}(g') = e^{|s|}$ (the reason for the absolute
value is that $g'$ might be in the lower fundamental domain). We conclude from
parts (1), (2) of Remark 9.7 that

\[
|s| - \log \ell < \log(\mathrm{ht}(y)) < |s| + \log \ell.
\]

9.6. The argument. 

Proof of Lemma 8.10. Fix $M > 1$ and $0 < \epsilon < 1$ (below $\epsilon$ replaces the number $\rho$ in the
statement of Lemma 8.10). Let $F \subset X$ be defined by

\[
F = (\partial B)_\epsilon \cup H_M. \tag{9.7}
\]
Define $\varphi : X \to [0, 1]$ as in Lemma 9.1; to ease the notation we suppress its
dependencies on $\epsilon, M$. Lemma 9.1 implies the assertion in
Lemma 8.10(1). Let $\psi = 1 - \varphi$. As $\varphi$ attains the value 1 on $X \setminus (F)_\epsilon$ we have that
$\psi \le \chi_{(F)_\epsilon}$. Furthermore, by Remark 9.7(1) and from the definitions we see that

\[
(F)_\epsilon \subset (\partial B)_{2\epsilon} \cup (H_M)_\epsilon \subset ((\partial B)_{2\epsilon} \cap K_M) \cup H_{M/\ell}.
\]

It follows that

\[
\int_X \psi \, dm_X \le m_X((\partial B)_{2\epsilon} \cap K_M) + m_X(H_{M/\ell}).
\]

Hence, by Remark 9.5, Lemma 8.10(2) will follow once we show that the following estimate
holds for all $M > 1$:

\[
m_X((\partial B)_{2\epsilon} \cap K_M) \ll \epsilon \log M. \tag{9.8}
\]

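In order to establish (9.8) we argue as follows: We first want to pull the calculation to
$G$ and then to $\mathbb{R}^3$. It is clear that $\pi(\partial\mathcal{B} \cap \mathcal{K}_M) = \partial B \cap K_M$ and as $\pi$ can only decrease
distances (that is, $\pi$ is 1-Lipschitz), we must have $\pi((\partial\mathcal{B})_{2\epsilon} \cap \mathcal{K}_M) \supseteq (\partial B)_{2\epsilon} \cap K_M$. By the
definition of the measure $m_X$ it follows that

\[
m_X((\partial B)_{2\epsilon} \cap K_M) \le m_G((\partial\mathcal{B})_{2\epsilon} \cap \mathcal{K}_M). \tag{9.9}
\]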

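Hence, we are reduced to estimating $m_G((\partial\mathcal{B})_{2\epsilon} \cap \mathcal{K}_M)$. We will work out below the estimate
for $m_G((\partial\mathcal{B}^+)_{2\epsilon} \cap \mathcal{K}_M)$ only. Let $N_\epsilon(L)$ denote the number of $\epsilon$-balls needed to cover a
set $L$. Clearly,

\[
N_{3\epsilon}((\partial\mathcal{B}^+)_{2\epsilon} \cap \mathcal{K}_M) \le N_\epsilon(\partial\mathcal{B}^+ \cap \mathcal{K}_M).
\]

We know that a ball of radius $\epsilon$ in $G$ has volume $\ll \epsilon^3$ and so we deduce that

\[
m_G((\partial\mathcal{B}^+)_{2\epsilon} \cap \mathcal{K}_M) \ll \epsilon^3 N_\epsilon(\partial\mathcal{B}^+ \cap \mathcal{K}_M). \tag{9.10}
\]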
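Consider the following four subsets of the coordinate domain in $\mathbb{R}^3$, which are mapped by the
coordinate map onto the boundary:

\[
\begin{aligned}
Q_1 &= \{(s, \theta, t) : s \in \mathbb{R},\ t \in (0, \epsilon_0),\ \theta = \theta_{\min}(s)\}; \\
Q_2 &= \{(s, \theta, t) : s \in \mathbb{R},\ t \in (0, \epsilon_0),\ \theta = \tfrac{\pi}{2}\}; \\
Q_3 &= \{(s, \theta, t) : s \in \mathbb{R},\ \theta \in I^+,\ t = 0\}; \\
Q_4 &= \{(s, \theta, t) : s \in \mathbb{R},\ \theta \in I^+,\ t = \epsilon_0\}.
\end{aligned}
\]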

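Let $Q = \bigcup_{i=1}^{4} Q_i$. A point in $\mathcal{B} \cap \mathcal{K}_M$ with coordinates $(s, \theta, t)$ must satisfy $|s| \le \log M +
\log \ell$ as explained in Remark 9.8. Hence, we conclude by Lemma 9.4 that

\[
N_\epsilon(\partial\mathcal{B}^+ \cap \mathcal{K}_M) \le N_{c^{-1}\epsilon}(Q \cap \{(s, \theta, t) : |s| \le \log M + \log \ell\}). \tag{9.11}
\]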

This reduces the problem to a Euclidean one: For each $1 \le i \le 4$ the surface

\[
Q_i \cap \{(s, \theta, t) : |s| \le \log M + \log \ell\}
\]

is a graph of a function from a domain in $\mathbb{R}^2$ to $\mathbb{R}$. The variables vary in a range that is of
bounded length in one direction and of length $2(\log M + \log \ell)$ in the other. As all these
functions have derivatives which are uniformly bounded (in fact, all of them are constant
apart from the function $(s, t) \mapsto \theta_{\min}(s)$ corresponding to $Q_1$, see (9.2)), we deduce that

\[
N_{c^{-1}\epsilon}(Q \cap \{(s, \theta, t) : |s| \le \log M + \log \ell\}) \ll \epsilon^{-2} \log M. \tag{9.12}
\]

Combining (9.12), (9.11), (9.10), and (9.9) gives (9.8), which as explained above concludes
the proof of Lemma 8.10(2). We turn now to the proof of Lemma 8.10(3).
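The way the four estimates fit together can be summarized in one chain. This is a sketch under two assumptions that are not fully legible in the text above: that the right-hand side of (9.12) is $\epsilon^{-2} \log M$, and that $\partial\mathcal{B}^-$ is treated exactly like $\partial\mathcal{B}^+$.

```latex
\begin{align*}
m_X\big((\partial B)_{2\epsilon} \cap K_M\big)
  &\le m_G\big((\partial\mathcal{B})_{2\epsilon} \cap \mathcal{K}_M\big)
     && \text{by (9.9)} \\
  &\ll \epsilon^{3}\, N_{\epsilon}\big(\partial\mathcal{B}^{+} \cap \mathcal{K}_M\big)
     && \text{by (9.10), and likewise for } \partial\mathcal{B}^{-} \\
  &\le \epsilon^{3}\, N_{c^{-1}\epsilon}\big(Q \cap \{|s| \le \log M + \log \ell\}\big)
     && \text{by (9.11)} \\
  &\ll \epsilon^{3} \cdot \epsilon^{-2} \log M = \epsilon \log M
     && \text{by (9.12).}
\end{align*}
```

The count in (9.12) reflects the Euclidean fact that a Lipschitz graph over a rectangle with side lengths $O(1)$ and $2(\log M + \log \ell)$ can be covered by roughly $\epsilon^{-1} \cdot \epsilon^{-1}(\log M + \log \ell)$ balls of radius $c^{-1}\epsilon$.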

Let $f : D \to \mathbb{C}$ be $\kappa$-Lipschitz and denote $\hat{f} = f \circ \tau$. The support of the product $\hat{f} \cdot \varphi$
is contained in the intersection of the supports of $\hat{f}$ and $\varphi$. By definition of the $\hat{\ }$ operator,
the support of $\hat{f}$ is contained in $B$. By definition of $\varphi$ its support is contained in the
intersection $\{x \in X : d_X(x, \partial B) \ge \epsilon\} \cap K_M$. It follows that

\[
\mathrm{supp}(\hat{f} \cdot \varphi) \subset \{x \in B : d_X(x, \partial B) \ge \epsilon\} \cap K_M. \tag{9.13}
\]

As the points of discontinuity of $\hat{f}$ are contained in $\partial B$ we conclude that $\hat{f} \cdot \varphi : X \to \mathbb{C}$
is continuous. In order to estimate its Lipschitz constant we wish to appeal to Remark 9.3.
Cover the open set $\mathcal{B} \cap \mathcal{K}_{2M}$ by open balls $\tilde{U}_i \subset \mathcal{B} \cap \mathcal{K}_{2M}$. Note that each
$\tilde{U}_i$ is contained in either $\mathcal{B}^+$ or $\mathcal{B}^-$. Consider the open cover $\{U_i\}$ of $\mathrm{supp}(\hat{f} \cdot \varphi)$, where
$U_i = \pi(\tilde{U}_i)$. By Remark 9.3, Lemma 8.10(3) will follow once we prove that $\hat{f} \cdot \varphi : U_i \to \mathbb{C}$
is $\max\{\kappa, \|f\|_\infty\}\epsilon^{-1}M$-Lipschitz. As $\varphi$ is $\epsilon^{-1}$-Lipschitz we see that by Remark 9.2 it is
enough to argue that for each $i$, $\hat{f} : U_i \to \mathbb{C}$ is Lipschitz with Lipschitz constant $\kappa M$.
As $U_i \subset K_{2M}$ we know by Lemma 9.6 that the norm of the differential of $\tau$ is $\ll M$
on $U_i$. It follows that the Lipschitz constant of the composition $\hat{f} = f \circ \tau$ is $\ll \kappa M$ as
desired. □
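The reduction to a Lipschitz bound for the composition alone rests on the usual product estimate, presumably the content of Remark 9.2 (which is not reproduced in this section). A sketch, writing $\hat{f} = f \circ \tau$ and assuming $0 \le \varphi \le 1$, that $\varphi$ is $\epsilon^{-1}$-Lipschitz, that $\hat{f}$ is $\kappa M$-Lipschitz on $U_i$, and that $\|\hat{f}\|_\infty \le \|f\|_\infty$:

```latex
% For x, y in U_i, with d = d_X(x, y):
\begin{align*}
|\hat{f}(x)\varphi(x) - \hat{f}(y)\varphi(y)|
  &\le |\hat{f}(x)|\,|\varphi(x) - \varphi(y)| + |\varphi(y)|\,|\hat{f}(x) - \hat{f}(y)| \\
  &\le \|f\|_\infty\, \epsilon^{-1} d + \kappa M\, d \\
  &\le 2 \max\{\kappa, \|f\|_\infty\}\, \epsilon^{-1} M\, d ,
\end{align*}
% where the last step uses 0 < epsilon < 1 < M.
```

The factor 2 is harmless, since the statement is only needed up to an absolute constant.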

Remark 9.9. We remark here about a slight inaccuracy in the arguments presented
above and how to remedy it: Let $M, N$ be two Riemannian manifolds and $f : U \to N$ a
smooth map from an open set $U \subset M$. Assume the differential of $f$ has norm bounded
by some constant $\kappa$ on $U$. We used above (in two places) the conclusion that $f$ must be
$\kappa$-Lipschitz. Strictly speaking, this shows indeed that $f$ is $\kappa$-Lipschitz, but with respect
to the metric induced from the restriction of the Riemannian metric from $M$ to $U$. This
need not be the restricted metric on $U$ in which we are interested. In order to remedy
this, one needs to prove that the following property holds: There exists some absolute
constant $c$ such that given any two points $x, y \in U$ one is able to find a path connecting
them inside $U$ of length $\le c\,d(x, y)$ (here $d$ is the metric of the ambient space containing
$U$).

Once this property is established, the conclusion is that $f$ has Lipschitz constant
$\ll \kappa$ (namely $c\kappa$). The above property clearly holds in any Euclidean ball. Using the fact that the
exponential map from the Lie algebra to $G$ is bi-Lipschitz when restricted to a small
enough neighborhood of zero, we see that any image of a small enough Euclidean ball
around zero is an open neighborhood of the identity in $G$ which satisfies the desired
property. Using left translations (which are isometries of $G$) we see that each point of $G$
has a basis of neighborhoods satisfying the above properties. Regarding the argument in
the very end of the proof of Lemma 8.10, we should simply take the sets of the cover to be such
neighborhoods instead of open balls. Regarding the use of this in Remark 9.7, we leave the
details to the reader.
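For completeness, the step from the path property to the Lipschitz bound can be sketched as follows; the path $\gamma$ and its parametrization are notation introduced here only for illustration.

```latex
% gamma : [0,1] -> U is a piecewise smooth path from x to y inside U
% with length(gamma) <= c d(x, y), as provided by the property above.
\begin{align*}
d_N\big(f(x), f(y)\big)
  &\le \operatorname{length}(f \circ \gamma)
   = \int_0^1 \big\| d_{\gamma(u)} f\,(\gamma'(u)) \big\| \, du \\
  &\le \kappa \int_0^1 \|\gamma'(u)\| \, du
   = \kappa \operatorname{length}(\gamma)
   \le c\,\kappa\, d(x, y).
\end{align*}
```

Thus $f$ is $c\kappa$-Lipschitz on $U$ with respect to the ambient metric $d$.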